GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen



OM protein - protein search, using sw model 



Run on: January 7, 2004, 16:44:17 ; Search time 44 Seconds
        (without alignments)
        1215.701 Million cell updates/sec



Title: US-10-088-872-2

Sequence: 1 MKKMPLFSKSHKNPAEIVKI......FADEKNYLIKQIRDLKKTAP 337

Scoring table: BLOSUM62

Gapop 10.0, Gapext 0.5

Searched: 1107863 seqs, 158726573 residues

Total number of hits satisfying chosen parameters: 1107863

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
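The throughput figure in the header can be cross-checked against the other header values. The sketch below assumes that one "cell update" is one comparison of a query residue with a database residue, so the total work is the query length (337) times the number of residues searched. That definition is an assumption; the report itself does not spell it out.

```python
# Sanity check of the reported search throughput, using values from the header.
# Assumption (not stated in the report): one "cell update" is one
# (query residue, database residue) comparison in the alignment matrix.

def million_cell_updates_per_sec(query_length, db_residues, seconds):
    """Return throughput in millions of cell updates per second."""
    return query_length * db_residues / seconds / 1e6

# 337-residue query, 158726573 residues searched, 44 seconds search time.
rate = million_cell_updates_per_sec(337, 158_726_573, 44)
print(f"{rate:.3f} Million cell updates/sec")
```

Under that assumption the computed value agrees with the 1215.701 Million cell updates/sec printed in the header.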



Database : A_Geneseq_19Jun03:*
 1: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1980.DAT
 2: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1981.DAT
 3: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1982.DAT
 4: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1983.DAT
 5: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1984.DAT
 6: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1985.DAT
 7: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1986.DAT
 8: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1987.DAT
 9: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1988.DAT
10: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1989.DAT
11: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1990.DAT
12: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1991.DAT
13: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1992.DAT
14: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1993.DAT
15: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1994.DAT
16: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1995.DAT
17: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1996.DAT
18: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1997.DAT
19: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1998.DAT
20: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA1999.DAT
21: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA2000.DAT
22: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA2001.DAT
23: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA2002.DAT
24: /SIDSl/gcgdata/geneseq/geneseqp-embl/AA2003.DAT



Pred No is the number of results predicted by chance to have a

RESULT 1
AAY94247
ID AAY94247 standard; protein; 337 AA.
XX 

AC AAY94247; 
XX 

DT 10-AUG-2000 (first entry)
XX
DE Human calcium binding protein hCBP.
XX 

OS Homo sapiens. 
XX 

PN WO200029580-A1. 
XX 

PD 25-MAY-2000. 

PF 12-NOV-1999; 99WO-US27027.
XX
PR 13-NOV-1998; 98US-0190965.
XX
PA (INCY-) INCYTE PHARM INC.
XX
PI Tang YT, Guegler KJ, Corley NC, Gorgone GA;
XX
DR WPI; 2000-387793/33.
DR N-PSDB; AAA27332.
XX
PT Human hCBP protein, and the nucleic acid encoding it, useful for e.g.
PT diagnosis, prevention and treatment of cancers, immune, developmental
PT or reproductive disorders -
XX
PS Claim 1; Fig 1; 72pp; English.
XX

CC The present sequence is the [...]
CC was obtained by screening coronary artery [...] cDNA library,
CC from which five [...] isolated, sequenced and
CC expressed to give the protein. The protein and the gene encoding it are
CC useful for the diagnosis and treatment of the following types of
CC disorder: cancers (such as adenocarcinomas), reproductive disorders
CC (such as infertility, ovulatory defects, endometriosis, disruptions of
CC the oestrus and menstrual cycles, polycystic ovary syndrome and ovarian
CC hyperstimulation), autoimmune disorders (such as benign prostatic
CC hyperplasia and prostatitis), developmental disorders (such as
CC Cushing's syndrome, muscular dystrophy and gonadal dysgenesis),
CC hereditary neuropathies, seizure disorders, immune disorders (such as
CC AIDS, allergies, anaemia, asthma, atherosclerosis, cholecystitis, Crohn's
CC disease, diabetes, Graves' disease, multiple sclerosis, psoriasis,
CC rheumatoid arthritis, scleroderma, Sjogren's syndrome and ulcerative
CC colitis), and viral, bacterial, fungal, parasitic, protozoal and
CC helminthic infections.



XX 

SQ Sequence 337 AA; 



Query Match 100.0%; Score 1704; DB 21; Length 337;
Pred. No. 1.3e-146;

Qy [...]KKTAP 337
Db 301 LIEFLSS[...]



RESULT 2
AAM39078
ID AAM39078 standard; Protein; 337 AA.
XX
AC AAM39078;
XX
DT 22-OCT-2001 (first entry)
XX
DE Human polypeptide SEQ ID NO 2223.

XX 
KW 

KW 
KW 
KW 
KW 

KW leukaemia . 
XX 

OS Homo sapiens. 

XX 

PN WO200153312-A1. 
XX 

PD 26-JUL-2001. 

XX
PF 26-DEC-2000; 2000WO-US34263.
XX
PR 21-JAN-2000; 2000US-0488725.
PR 25-APR-2000; 2000US-0552317.
PR 09-JUL-2000; 2000US-0598042.



Human poiypep'-^^ ~— 

^rt-natatic- qene therapy; cancer; 
« nootropic; i™^^"f t i y 7«ftf a l nervous system; CS; 

• rttelt1 -' infla ™ ation " 



PR 19-JUL-2000; 2000US-0620312.
PR 03-AUG-2000; 2000US-0653450.
PR 14-SEP-2000; 2000US-0662191.
PR 19-OCT-2000; 2000US-0693036.
PR 29-NOV-2000; 2000US-0727344.
XX
PA (HYSE-) HYSEQ INC.
XX
PI
PI Zhao QA, Zhou P, Goodrich R, Drmanac RT;
XX
DR WPI; 2001-442253/47.
DR N-PSDB; AAI58234.

XX
PT Novel nucleic acids and polypeptides, useful for treating disorders
PT such as central nervous system injuries -
XX
PS Example 4; SEQ ID NO 2223; 10078pp; English.
XX
CC The invention relates to human nucleic acids (AAI57798-AAI61369) and
CC the encoded polypeptides (AAM38642-AAM42213) with nootropic,
CC immunosuppressant and cytostatic activity. The polynucleotides are useful
CC in gene therapy. A composition containing a polypeptide or polynucleotide
CC of the invention may be used to treat diseases of the peripheral nervous
CC system, such as peripheral nervous injuries, peripheral neuropathy and
CC localised neuropathies and central nervous system diseases, such as
CC Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
CC lateral sclerosis, and Shy-Drager Syndrome. Other uses include the
CC utilisation of the activities such as: Immune system suppression,
CC Activin/inhibin activity, chemotactic/chemokinetic activity, haemostatic
CC and thrombolytic activity, cancer diagnosis and therapy, drug screening
CC assays for receptor activity, arthritis and inflammation, leukaemias and
CC C.N.S disorders.
CC Note: The sequence data for this patent did not form part of the printed
CC specification.
XX 

SQ Sequence 337 AA;

Query Match 100.0%; Score 1704; DB 22; Length 337;
Best Local Similarity 100.0%; Pred. No. 1.3e-146;
Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

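Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||
Db 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337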



RESULT 3
AAB82090
ID AAB82090 standard; Protein; 337 AA.
XX
AC AAB82090;
XX
DT 26-JUN-2001 (first entry)
XX
DE Human Acute Neuronal Induced Calcium Binding Protein, ANIC-BP.

XX 
KW 
KW 
KW 
XX 

OS Homo sapiens.
XX
PN WO200123552-A1.
XX
PD 05-APR-2001.
XX
PF 18-SEP-2000; 2000WO-EP09132.
XX
PR 24-SEP-1999; 99EP-0118848.
XX
PA (MERE ) MERCK PATENT GMBH.
XX
PI Den Daas I, Duecker K;
XX
DR WPI; 2001-308142/32.
DR N-PSDB; AAF86462.

' ' acute head trauma, multiple sclerosis and spinal cord injury 
Claim 1; Page 41-42; 45pp; English. 

Th e preset sequence is ^^1" ^^"^-."L 
mdue.d calcju™ Binding in WTO BP. » ^ ^ 

protein are useful for g Bp coaing sequence and protein 

a C « SU"JSuT ari,ecIneeX y induei„ g an i-u„olo g i=al response in a 



PT 

XX 
PS 
XX 
CC 
CC 
CC 
CC 
CC 

CC mammal 
XX 

SQ Sequence 337 AA;

Query Match 100.0%; Score 1704; DB 22; Length 337;
Pred. No. 1.3e-146;


Db 
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ov 


61 


Db 


61 


Ov 


121 


Db 


121 


Qy 


181 


Db 


181 


Qy 


241 


Db 


241 


Qy 


301 


Db 


301 



LEKQD KKTD Mb t^v^rvo aux^^ 

,• , r -r r» -i n 



II 



121 AHFiiiurnjjuiw*— ~ — 

RESULT 4
AAB94139
ID AAB94139 standard; Protein; 289 AA.
XX
AC AAB94139;
XX
DT 26-JUN-2001 (first entry)
XX
DE Human protein sequence SEQ ID NO: 14408.
XX
KW Human; primer; detection; diagnosis; antisense therapy; gene therapy.
XX
OS Homo sapiens.
XX
PN EP1074617-A2.
XX
PD 07-FEB-2001.
XX
PF 28-JUL-2000; 2000EP-0116126.
XX
PR 29-JUL-1999; 99JP-0248036.
PR 27-AUG-1999; 99JP-0300253.
PR 11-JAN-2000; 2000JP-0118776.
PR 02-MAY-2000; 2000JP-0183767.
PR 09-JUN-2000; 2000JP-0241899.
XX
PA (HELI-) HELIX RES INST.
XX



PI 
PI 

XX 
DR 
XX 
PT 
PT 
PT 
PT 
XX 
PS 
XX 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
XX 
SQ 



■ v qaito K, Yamamoto J; 

WPI; 2001-318749/34- 

full-length cDNAs - 

n • 8 SEQ ID 14408; 2537pp + CD ROM; English. 
Claim 8, SEQ synthe sising 5602 

t „ the ^^f^^ee defined in " M> "oSnetion 

ro^:^^^^^ 

Jiigonecleotide eomprise. « ^ ^ eelected from Jhose defied ^ 

represent oligonucleotides, 
of the present invention. 



Sequence 



289 AA; 



86.0%; Score 1466; DB 22 

Query Match 99 7% • Pred. No. 4.6e-12b 

Best Local Similarity • » Mismatches ! 
Matches 288; Conservative 



Length 2 89; 

indels 0; Gaps 0-; 



QY 
Db 

QY 
Db 

QY 
Db 

Qy 

Db 



oaf*. Conservative ^, 

2 ' rvx Tn-R-rTCKDVTOIFNNILRRQ 108 

:SS SSS»S" 



Qy 



289 PIVEILLKNQPKLIEFLSSFQKERTD^ 337 



II 



Db 241 iiiElL^ip^iEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 289 



RESULT 5 
AAB48970 

ID AAB48970 standard; Protein; 341 AA. 
XX 

AC AAB48970; 
XX 
DT 27-MAR-2001 (first entry)
XX
DE Human ANIC-BP (acute neuronal induced calcium-binding protein).
XX
KW neuronal induced calcium-binding protein; ANIC-BP;
KW acute head trauma; multiple sclerosis;
KW cerebroprotective; neuroprotective.
XX 

OS Homo sapiens . 
XX 

PN WO200078947-A1 . 

XX . . 

PD 28-DEC-2000. 
XX 

PF 14-JUN-2000; 2000WO-EP0B457 . 

XX 

PR 22-JUN-1999; 99EP-0112024 . 

XX - 

PA (MERE ) MERCK PATENT GMBH. 
XX 
PI Den Daas I, Fischer V, Seyfried C, Von Melchner L;
XX
DR WPI; 2001-102721/H.
DR N-PSDB; AAC91772.
XX
PT
PT
PT injury
XX
PS Claim 2; Page 37; 50pp; English.

xx , - u„ mal1 acute neuronal induced calcium-binding 

CC The invention relates to human - - , fc The invention 

C C C C S e rela^s C to P Uression systems and recombinant c^J-xng, 

CC ANIC-BP DNA, the recombinant production of ^.-BP antxbodxe . ^ 

CC for ANIC-BP, ^^"^^S" £ ^atorf of ANIC-BP^ unction. 

CC Fc region, and methods of screeni g HymA and Mo25 proteins 

CC ANIC-BP has homology and *^ural simil arity JJJ stroke and 

CC ANIC-BP proteins ^ -ucleotrdes are useful ^ _ 

CC acute head trauma, multiple BCierosisa p identifying membrane bound 

CC proteins are useful m greening assays fo r^denti y 9 are 

CC or soluble receptors, and also in vaccines AN ion studieS/ 

CC useful as diagnostic reagent . ools f or t * and in 

C C C C IZ ^^SIS/SS;.^ Present seguence represents 



CC human ANIC-BP- 
XX 

SQ Sequence 341 AA;

Query Match 81.0%; Score 1381; DB 22; Length 341;
Best Local Similarity [...]; Pred. No. 3.2e-117;
Matches 273; Conservative [...]; Mismatches [...]; Indels 4; Gaps 2;

' Birrt DKKTDKASEEVSKSLQAMKEILCGTNEK 59 

-FSKSHIMPABIVnU^™---^^ , ,,,,,, ,,,,, 



Qy 


4 W 


Db 


i I s 


Qy 


60 I 


Db 


61 ' 


Qy 


120 


Db 


121 


Qy 


130 


Db 


181 


Qy 


240 


Db 


241 


Qy 


300 


Db 


301 



=11111111 



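RESULT 6
AAE10858
ID AAE10858 standard; Protein; 496 AA.
XX
AC AAE10858;
XX
DT 18-DEC-2001 (first entry)
XX
DE Gal4-human ANIC-BP-1 fusion protein.
XX
KW Human; acute neuronal induced calcium binding protein type 1 ligand;
KW ANIC-BP-1; human disease; stroke; head trauma; multiple sclerosis;
KW Parkinson's disease; Alzheimer's disease; spinal cord injury; vaccine;
KW gene therapy; fusion protein; Gal4 protein.
XX
OS Chimeric - Homo sapiens.
OS Chimeric - Unidentified.
XX
PN WO200170771-A2.
XX
PD 27-SEP-2001.
XX
PF 20-MAR-2001; 2001WO-EP03149.
XX
PR 21-MAR-2000; 2000EP-0106110.
XX
PA (MERE ) MERCK PATENT GMBH.
XX
PI Den Daas I, Duecker K, Hock B;
XX
DR WPI; 2001-607519/69.
XX
PT Novel acute neuronal induced calcium binding protein type 1 ligand
PT polypeptides, useful in the treatment of stroke, head trauma, multiple
PT sclerosis, Parkinson's disease, Alzheimer's disease and spinal cord
PT injury -
XX
PS Disclosure; Page 42-44; 46pp; English.
XX
CC The invention relates to human acute neuronal induced calcium binding
CC protein type 1 (ANIC-BP-1) ligand polypeptides and polynucleotides.
CC Sequences of the invention are useful for treating human diseases
CC including stroke, head trauma, multiple sclerosis, Parkinson's disease,
CC Alzheimer's disease and spinal cord injury. They are also useful as
CC vaccines. ANIC-BP-1 ligands are useful for identifying membrane bound
CC or soluble receptors. Polynucleotides of the invention are useful as
CC diagnostic reagents, for chromosome localization studies and as
CC valuable tools for tissue expression studies. They are also useful in
CC gene therapy. The present sequence is Gal4-human ANIC-BP-1 fusion
CC protein comprising the Gal4 protein and a C-terminally linked human
CC ANIC-BP-1 protein.
XX
SQ Sequence 496 AA;

Query Match 81.0%; Score 1381; DB 22; Length 496;
Matches 273; Conservative [...]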



Qy 


4 1 


Db 


156 


Qy 


60 


Db 


216 


Qy 


120 


Db 


276 



215 



Qy 

Db 

Qy 

Db 

Qy 

Db 



RESULT 7 
AAE10859 

ID AAE10859 standard; Protein; 552 AA. 
XX 

AC AAE10859; 
XX 

DT 18-DEC-2001 (first entry) 

XX 

DE LexA-human ANIC-BP-1 fusion protein 
XX 
KW Human; acute neuronal induced calcium binding protein type 1 ligand;
KW ANIC-BP-1; human disease; stroke; head trauma; multiple sclerosis;
KW Parkinson's disease; Alzheimer's disease; spinal cord injury; vaccine;
KW gene therapy; fusion protein; LexA protein.
XX
OS Chimeric - Homo sapiens.
OS Chimeric - Unidentified.
XX
FH Key Location/Qualifiers
FT Region 1..202
FT /note= "LexA protein"
FT Region 203..552
FT /note= "Human ANIC-BP-1 protein"

XX • 

PN WO200170771-A2 . 

XX 

PD 27-SEP-2001. 
XX 

PF 20-MAR-2001; 2001WO-EP03149 . 
XX 

PR 21-MAR-2000; 2000EP- 0106110 . 
XX 

PA (MERE ) MERCK PATENT GMBH. 

XX 

PI Den Daas I, Duecker K, Hock B; 
XX 

DR WPI; 2001-607519/69 
XX 
PT Novel acute neuronal induced calcium binding protein type 1 ligand
PT polypeptides, useful in the treatment of stroke, head trauma, multiple
PT sclerosis, Parkinson's disease, Alzheimer's disease and spinal cord
PT injury -
XX
PS Disclosure; Page 44-46; 46pp; English.
XX
CC The invention relates to human acute neuronal induced calcium binding
CC protein type 1 (ANIC-BP-1) ligand polypeptides and polynucleotides.
CC Sequences of the invention are useful for treating human diseases
CC including stroke, head trauma, multiple sclerosis, Parkinson's disease,
CC Alzheimer's disease and spinal cord injury. They are also useful as
CC vaccines. ANIC-BP-1 ligands are useful for identifying membrane bound
CC or soluble receptors. Polynucleotides of the invention are useful as
CC diagnostic reagents, for chromosome localization studies and as
CC valuable tools for tissue expression studies. They are also useful in
CC gene therapy. The present sequence is LexA-human ANIC-BP-1 fusion
CC protein comprising the LexA protein and a C-terminally linked human
CC ANIC-BP-1 protein.
XX 

SQ Sequence 552 AA; 

Query Match 81.0%; Score 1381; DB 22; Length 552;

^^'^"^'nrSi^cSrU, xnaa l8 4, caps „ 



Matches 


273 


Qy 


4 1 


Db 


212 


Qy 


60 


Db 


2 72 


Qy 


120 


Db 


332 


Qy 


180 


Db 


3 92 


Qy 


240 


Db 


452 


QY 


300 


Db 


512 



TEAVAQLAQELYSSGLLVTLIADLQLIDFEC^ 

—ass 

: I I I M I M II • I • I i Jill ' J ' ' ' ^Itt.wsf.OFYDFFRYVEMSTFDI 391 



IIIIIIIIIMMIMMMIII Mill MMIMi I; 



I I I I I M II : : III I I : I I I I ! M • I 

KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 548 



RESULT 8 
AAY94248 

ID AAY94248 standard; protein; 341 AA.
XX
AC AAY94248;
XX
DT 10-AUG-2000 (first entry)
XX
DE Mouse calcium binding protein M025.
XX
KW
KW
KW seizure disorder; immune disorder; infection.
XX
OS Mus sp.
XX
PN WO200029580-A1.
XX
PD 25-MAY-2000.
XX
PF 12-NOV-1999; 99WO-US27027.
XX
PR 13-NOV-1998; 98US-0190965.
XX
PA (INCY-) INCYTE PHARM INC.
XX
PI Tang YT, Guegler KJ, Corley NC, Gorgone GA;
XX
DR WPI; 2000-387793/33.
XX
PT Human hCBP protein, and the nucleic acid encoding it, useful for e.g.
PT diagnosis, prevention and treatment of cancers, immune, developmental
PT or reproductive disorders -
XX
PS Disclosure; Page 66-67; 72pp; English.
XX
CC The present sequence is the mouse calcium binding protein M025. It
CC was used in a sequence alignment to identify human calcium binding
CC protein hCBP. The hCBP protein and the gene encoding it are
CC useful for the diagnosis and treatment of the following types of
CC disorder: cancers (such as adenocarcinomas), reproductive disorders
CC (such as infertility, ovulatory defects, endometriosis, disruptions of
CC the oestrus and menstrual cycles, polycystic ovary syndrome and ovarian
CC hyperstimulation), autoimmune disorders (such as benign prostatic
CC hyperplasia and prostatitis), developmental disorders (such as
CC Cushing's syndrome, muscular dystrophy and gonadal dysgenesis),
CC hereditary neuropathies, seizure disorders, immune disorders (such as
CC AIDS, allergies, anaemia, asthma, atherosclerosis, cholecystitis, Crohn's
CC disease, diabetes, Graves' disease, multiple sclerosis, psoriasis,
CC rheumatoid arthritis, scleroderma, Sjogren's syndrome and ulcerative
CC colitis), and viral, bacterial, fungal, parasitic, protozoal and
CC helminthic infections.



SQ Sequence 341 AA; 

Query Match 80.8%; Score 1376; DB 21; Length 341;

^^r^f^a^ — - "Gaps , 

4 ---—iv™^ » 

! llPFPFGKSHKSPMIVKNLKESMAVLE^ 60 
60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFE^ 11* 

X„ C TQQ»liUiiUlisi E liUi™^iiiUl,AK III .W S E Q PVD Fra VV E MST FD1 1.0 

181 UiiUUUil™kLiE™™ F FSE IE KLL 1 ,SE H YV T K E QSL.KI.L G ELLLDR 240 
2*C M»FAIMT K YISKPEKLK.:,»m.ELRDK S P N IQEEAFHyPKVFVASPHKTQPIVEILL™QP 2P=> 



Qy 

Db 

Qy 

Db 



Db 

Qy 



300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336 



Illllll II Ml 11=1111=11= I 

301 IJi IE FLSKFQNDRTEDEQ^DEKTYLVKQ IRNLKRAA 337 



RESULT 9
ABG23844
ID ABG23844 standard; Protein; 354 AA.
XX
AC ABG23844;
XX 

DT is-FEB-2002 (first entry) 
DE Novel human diagnostic protein #23835. 

XX 
KW 
KW 
XX 

OS Homo sapiens. 

XX 

PN WO200175067-A2 . 
XX 

PD 11-OCT-2001.
XX
PF 30-MAR-2001; 2001WO-US08631.
XX
PR 31-MAR-2000; 2000US-0540217.
PR 23-AUG-2000; 2000US-0649167.
XX
PA (HYSE-) HYSEQ INC.
XX
PI Drmanac RT, Liu C, Tang YT;
XX
DR WPI; 2001-639362/73.
DR N-PSDB; AAS83031.



XX . ^ iAta ari(1 encoded polypeptides... useful in 

PT Hew isolated P ol y nucle °^/* ap ^ ng identif ication of mutations 
PT diagnostics, forensxcs other traits and to assess 



PT biodiversity 

PS Claim 20; SEQ ID No 54203; 103 PP ; English. 
XX , . _ -..i^pH nolvnucleotide (I) and 

CC 



The invention relates to isolate J P^^f hybridisation probes, 
polypeptide (II) p^ers, oligomers, and for chromosome 

polymerase chain reaction <P_R) *"™ e ' ct • on of (n) . The 
and gene mapping, and in re ^niagnostics as expressed sequence tags 
, , polynucleotides are also used in diagnos ^ 

CC for identifying ex Passed gen W disease gtates involving 

CC to restore normal activity of (II) o against it, detecting or 

CC (ID • (ID i* - eful f °Lf in tissL as molecular weight markers and as 
CC guantitating a P°^"*V ^"^g partners are useful in medical 
CC a food supplement. (II) and gP ^ useful for treat ing 

CC imaging of sites expressing (II) ^ (I > Qr biologica l activity. 

CC disorders involving aberrant prot P have applicat ions m 

CC The polypeptide and P ol y nucieOC ;^? identification of mutations 
CC diagnostics forensics f^^' J^^s to assess biodiversity 
CC responsible for genetic disorders 



CC 
CC 
CC 
CC 



cc 
cc 
cc 
cc 



f data and products dependent on DNA and 
and to produce other ^ 00 O 0 f 10 d S a G 3037 7 P represent nova! human 
amino acid sequences. ^ of the invention- pr inte* 

diagnostic ammo acid se f^ 1 t did not appear in g P ^ 

Note: The sequence data for^ .P electronic forma t directly 

S " 79,. Score 135. » {V 

^i^-^^jr'33rsi-^ - indeis 4; Gaps 2; 

Matches 267; Conservat DK KTDKAS E EVSK S LQAMKEILCGTN^ 59 

14 MPFPFGKSHKSPAUl tjtjoTGTRSPTVEYI 119' 

- -•sSISESSHuusi ... 

74 ppQTEAGAQLAQELYNSGLLITLV Bnw «yvFLSTFt»I 179 ' 



QY 

Db 

QY 
Db 

Qy 

Db 

Qy 



Qy 

Db 

Qy 

Db 



RESULT 10
AAB20387
ID AAB20387 standard; Protein; 350 AA.
XX
AC AAB20387;
XX 
DT 
XX 
DS 
XX 
KW 
KW 
KW 
KW 
KW 
XX 

OS Homo sapiens 



-tttxt onm (first entry) 
H-JUN-2001 \ J np_iB 

_ _ — — — — .-^ rr ' 

M « neuronal innnc.n »l=i- S^T^^T niseas.; 

ebroprotectxve; antxpax 
c^enx^F • vaccine- 

tberapy; diagnosis, 



PN WO200125423-A1.
XX
PD 12-APR-2001.
XX
PF 28-SEP-2000; 2000WO-EP09475.
XX
PR 04-OCT-1999; 99EP-0119113.
XX
PA (MERE ) MERCK PATENT GMBH.
XX
PI Duecker K, Den Daas I;
XX
DR WPI; 2001-266306/27.
DR N-PSDB; AAF30688.
XX 

PT Novel human acute n^iunax — H . raMa 

. 4- „r*~-F^^^ fnr freatinq stroke, acute head, ^.rauma, 

PT cord injury - 

yx 

PS Claim 2; Page 44-45; 49pp; English. 

XX fh.t- nf a novel human acute neuronal induced 

CC The present sequence is that ot a novex nu N p-i B 

CC ine pioLcxn oi »vT-rn R p a nrotein discovered bv mRNA 

CC protein family, including ANIC-BP, aprotein 

? :rtX ^C-BP^ir^ 

, cc. The variant, protein could serve as a novel drug target . The 



N ovel human acute neuronal induced calcium-binding P-ein^e protein 



invent ion provides ANIC-BP- IB polynucleotides (see AAF30688, and 
ioI^ePtides, expression vectors ,. host cells and antibodies, as 
Sr as method, for producing the protein and for. treat xng or 
preventing disorders associated with expression of the P-ein^by 
inhibiting or act„ Parkinson ,. s 

Hease Alzheimer's disease, multiple sclerosis and spinal cord 
iniury ?he Polynucleotides and polypeptides can also be used on 
Hiaqnostic assays and in vaccines, and to identify agonists ana 
antagonists use ul for treating conditions associated with 



uv- 
ea 
cc 
cc 
cc 

CC injury. 
CC 

CC 

CC ANIC-BP -IB imbalance. 

XX 

SQ Sequence 350 AA;

Query Match 76.1%; Score 1297.5; DB 22; Length 350;

-is 13; Oaps : 

1 mpfpfgKSHKSPADIVKN^ 60 
f0 EPPTEAVAQLAQEL x S SGLL VTLI ADIiQLIDFEGKKDVTQ I VT^? ^ Tl^f ^'hm'T "[ " 

121 CTQOBiiULiUUsiEiiUiliiiEcliHEPLAKIILWSEQf-VDFFRYVEMSTP-DI 18 



Db 



Db 



18! iiiUiUUliliiKisiE^HliR^SElEKLX.BSENYVTKEQSLKI.LCKt.LLDR 240 

cy 100 KLIEFLSSFQKERTD DEQFADEKNYLIKQIRDLKKTA 336 

Db 301 l^IEFLSKFQNDRTDCMSSSVPTTNSRVDLRVKPRTRGIRDLKRPft 346 

RESULT 11 
AAM40864 

ID AAM40864 standard; Protein; 237 AA. 
XX 

AC AAM40864; 
XX 

DT 22-OCT-2001 (first entry) 
XX 1 

DE Human polypeptide SEQ ID NO 5795. 
XX 
KW 
KV1 
KW 



oeripheral nervous a y D tera, neuropa y disease; haemostatic; 

" ^n: , \S^i^SrS»^ t S^.4 Atactic,. 



KW leukaemia. 
XX 

OS Homo sapiens.
XX
PN WO200153312-A1.
XX
PD 26-JUL-2001.
XX
PF 26-DEC-2000; 2000WO-US34263.
XX
PR 21-JAN-2000; 2000US-0488725.
PR 25-APR-2000; 2000US-0552317.
PR 09-JUL-2000; 2000US-0598042.
PR 19-JUL-2000; 2000US-0620312.
PR 03-AUG-2000; 2000US-0653450.
PR 14-SEP-2000; 2000US-0662191.
PR 19-OCT-2000; 2000US-0693036.
PR 29-NOV-2000; 2000US-0727344.
XX
PA (HYSE-) HYSEQ INC.
XX
PI [...] Chen R, Ma Y, Qian XB, Ren F, Wang D;
PI [...]
PI Zhao QA, Zhou P, Goodrich R, Drmanac RT;
XX
DR WPI; 2001-442253/47.
DR N-PSDB; AAI60020.
XX 



PT Novel nucleic acids and polypeptides , useful for treating disorders 

PT such as central nervous system injuries - 

XX 

PS Example 2; SEQ ID NO 5795; 10078pp; English. 

XX 

CC The invention relates to human nucleic acids (AAI57798-AAI61369) and
CC the encoded polypeptides (AAM38642-AAM42213) with nootropic,
CC immunosuppressant and cytostatic activity. The polynucleotides are useful
CC in gene therapy. A composition containing a polypeptide or polynucleotide
CC of the invention may be used to treat diseases of the peripheral nervous
CC system, such as peripheral nervous injuries, peripheral neuropathy and
CC localised neuropathies and central nervous system diseases, such as
CC Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
CC lateral sclerosis, and Shy-Drager Syndrome. Other uses include the
CC utilisation of the activities such as: Immune system suppression,
CC Activin/inhibin activity, chemotactic/chemokinetic activity, haemostatic
CC and thrombolytic activity, cancer diagnosis and therapy, drug screening
CC assays for receptor activity, arthritis and inflammation, leukaemias and
CC C.N.S disorders.
CC Note: The sequence data for this patent did not form part of the printed
CC specification.
XX
SQ Sequence 237 AA;

Query Match 68.2%; Score 1162; DB 22; Length 237;
Best Local Similarity 100.0%; Pred. No. 1.6e-97;
Matches 227; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 111 TRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFK 170
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2 TRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFK 61

Qy 171 YVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLK 230
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  62 YVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLK 121

Qy 231 LLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPI 290
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 122 LLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPI 181

Qy 291 VEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||||||||||||
Db 182 VEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 228



RESULT 12 
ABB60392 

ID ABB60392 standard; Protein; 339 AA. 
XX 

AC ABB60392;
XX 

DT 26-MAR-2002 (first entry) 
XX 

DE Drosophila melanogaster polypeptide SEQ ID NO 7968. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 
KW pharmaceutical . 



XX 

OS Drosophila melanogaster . 
XX 

PN WO200171042-A2 . 
XX 

PD 27-SEP-2001. 
XX 

PF 23-MAR-2001; 2001WO-US09231.
XX
PR 23-MAR-2000; 2000US-191637P.
PR 11-JUL-2000; 2000US-0614150.
XX 

PA (PEKE ) PE CORP NY. 
XX 

PI Venter JC, Adams M, Li PWD, Myers EW; 
XX 

DR WPI; 2001-656860/75. 

DR N-PSDB; ABL04495 . 
XX 

PT New isolated nucleic acid detection reagent for detecting 1000 or more 

PT genes from Drosophila and for elucidating cell signalling and cell-cell 

PT interactions - 
XX 

PS Disclosure; SEQ ID NO 7968; 21pp + Sequence Listing; English. 

XX 

CC The invention relates to an isolated nucleic acid detection reagent
CC capable of detecting 1000 or more genes from Drosophila. The invention is
CC useful in developmental biology and in elucidating cell signalling and
CC cell-cell interactions in higher eukaryotes for the development of
CC insecticides, therapeutics and pharmaceutical drugs. The invention
CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA
CC sequences (ABL01840-ABL16175) and the encoded proteins
CC (ABB57737-ABB72072).
CC The sequence data for this patent did not form part of the printed
CC specification, but was obtained in electronic format directly from WIPO
CC at ftp.wipo.int/pub/published_pct_sequences.
XX
SQ Sequence 339 AA;

Query Match 65.2%; Score 1111; DB 22; Length 339;
Best Local Similarity 65.0%; Pred. No. 1.2e-92;
Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3;

MPLFSKSHKNPAEIVKILKt)NLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 

MM II hi hi I I h ; II hi Ml hi MM -I M h- III 

MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLYGSSDAEPPA 
E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 

: Mlhll I hi Ih M M III MM I i II h I II I I I II II I I M I 

DY WAQLSQEL YNSNLLLLL I QNLHRIDFEGKKH VAL I FNNVLRRQ IGTRS PTVE Y I CTK 
PHILFMLLKGYE- -APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 

I III h III hill I Mill hi III hi hM lhllhllll.il 

PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 
SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 

M | h M h I 1 1 II h I hlh Ml I = hMI M I II h M II 1 1 M I h I II 



Qy 


4 


Db 


1 


Qy 


64 


Db 


'61 


Qy 


123 


Db 


121 


Qy 





Db 



181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 24 0 



Qy 



240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 2 99 



Db 




Qy 



30 0 KL I EFLS S FQKERTDDEQFADEKNYL I KQ I RDLK 33 3 



Db 
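Each hit in this report is a flat-file record whose lines begin with a two-letter code (ID, AC, DT, DE, KW, OS, and so on), with XX lines as separators. The sketch below shows one way such a record could be grouped by line code. The function name `parse_record` and the sample list are hypothetical, and the sample lines are cleaned-up copies of the RESULT 12 header above, not output from any official Geneseq tool.

```python
import re

def parse_record(lines):
    """Group record lines by their two-letter line code.

    Separator lines ("XX") and empty codes carry no text and are skipped.
    Repeated codes (for example several KW lines) are kept in order.
    """
    fields = {}
    for line in lines:
        match = re.match(r"^([A-Z]{2})\s+(\S.*?)\s*$", line)
        if match is None:
            continue  # separator, blank line, or alignment row
        code, text = match.groups()
        fields.setdefault(code, []).append(text)
    return fields

# Hypothetical sample: header lines of RESULT 12 with OCR spacing repaired.
sample = [
    "ID ABB60392 standard; Protein; 339 AA.",
    "XX",
    "AC ABB60392;",
    "XX",
    "DT 26-MAR-2002 (first entry)",
    "XX",
    "DE Drosophila melanogaster polypeptide SEQ ID NO 7968.",
    "XX",
    "KW Drosophila; developmental biology; cell signalling; insecticide;",
    "KW pharmaceutical.",
    "XX",
    "OS Drosophila melanogaster.",
]

record = parse_record(sample)
# The sequence length is the number before "AA." on the ID line.
seq_length = int(re.search(r"(\d+)\s+AA\.", record["ID"][0]).group(1))
print(record["AC"][0], seq_length, record["OS"][0])
```

Keeping every code as a list, even single-valued ones, avoids special cases for multi-line fields such as KW, PR and CC.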




RESULT 13 
AAY94249 

ID AAY94249 standard; protein; 339 AA. 

XX 

AC AAY94249;
XX 

DT 10-AUG-2000 (first entry) 

XX 

DE Drosophila calcium binding protein DM025 . 

XX 

KW Drosophila; calcium binding protein; cancer; inflammation; DM025; CBP; 
KW reproductive disorder; autoimmune disorder; developmental disorder; 
KW seizure disorder; immune disorder; infection. 

XX 

OS Drosophila melanogaster.
XX
PN WO200029580-A1.
XX
PD 25-MAY-2000.
XX
PF 12-NOV-1999; 99WO-US27027.
XX
PR 13-NOV-1998; 98US-0190965.
XX
PA (INCY-) INCYTE PHARM INC.
XX
PI Tang YT, Guegler KJ, Corley NC, Gorgone GA;
XX
DR WPI; 2000-387793/33.
XX
PT Human hCBP protein, and the nucleic acid encoding it, useful for e.g.
PT diagnosis, prevention and treatment of cancers, immune, developmental
PT or reproductive disorders -
XX
PS Disclosure; Page 67-68; 72pp; English.
XX 

CC The present sequence is the Drosophila calcium binding protein DM025. It
CC was used in a sequence alignment to identify human calcium binding
CC protein hCBP. The hCBP protein and the gene encoding it are
CC useful for the diagnosis and treatment of the following types of
CC disorder: cancers (such as adenocarcinomas), reproductive disorders
CC (such as infertility, ovulatory defects, endometriosis, disruptions of
CC the oestrus and menstrual cycles, polycystic ovary syndrome and ovarian
CC hyperstimulation), autoimmune disorders (such as benign prostatic
CC hyperplasia and prostatitis), developmental disorders (such as
CC Cushing's syndrome, muscular dystrophy and gonadal dysgenesis),
CC hereditary neuropathies, seizure disorders, immune disorders (such as
CC AIDS, allergies, anaemia, asthma, atherosclerosis, cholecystitis, Crohn's
CC disease, diabetes, Graves' disease, multiple sclerosis, psoriasis,
CC rheumatoid arthritis, scleroderma, Sjogren's syndrome and ulcerative
CC colitis), and viral, bacterial, fungal, parasitic, protozoal and
CC helminthic infections.

XX 

SQ Sequence 339 AA; 

Query Match 65.1%; Score 1109; DB 21; Length 339; 

Best Local Similarity 65.0%; Pred. No. 1.8e-92; 

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 
rv 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEP^ 63 

^ I I I I II 1:1 | : | | | | : • II 1 = 1 =11 1 = 111 = 1 =1 1 = = = III 

Db 1 MPLFGKSQKSP^ 60 

Qy 64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVT "2. 

Qy . nil :| |||: I lh II =1 Mill IN MINI MINIM I INI 

Db 61 dywaqLqelynsnllllliqnlh^ 120 

Qv 123 PHILFMLLl^ 180 

I III I : Ml hill I Mill hi M I hi hM IhllhllllM 

Db 121 PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFK^ 180- 

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQS^ 239 



Db 



MM Ml hi M II I M | : 1 1 : IM I = I - 1 1 II II II M II I M I II h . . . 

131 SDAFSTFKELI/rRH 240 
Ov 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299- 

y ^ in . i : : 1 1 1 : 1 1 1 1 ! 1 1 1 h h- M I II ! M M II II 1 1 1 h h I Mi -I I hi I 

211 MNFTVMTRYISEPEN^ 300 



Db 



nv 300 TCL1EFLSSFQKERTDDEQFADEKNYL1KQIRDLK 333 

QY ^ ||::|hM M-IMI Ml IIIMhMI 

Db 301 KLVDFLTNFHTDRSEDEQFNDEKAYL I KQIKELK 3 34 
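The summary statistics printed for each hit can be reproduced from the other numbers in the report. The sketch below rests on two assumptions that the report does not state explicitly: that "Query Match" is the hit score divided by the score of the query aligned to itself (taken here as 1704, the score of the 100.0% hits), and that "Best Local Similarity" is the number of identical positions divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function names are illustrative only.

```python
# Assumed definitions (see lead-in); 1704 is the score of the 100.0% hits.
PERFECT_SCORE = 1704.0

def query_match(score, perfect_score=PERFECT_SCORE):
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score

def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns

# (label, score, (matches, conservative, mismatches, indels)) from the report.
hits = [
    ("RESULT 12", 1111, (217, 59, 54, 4)),
    ("RESULT 13", 1109, (217, 59, 54, 4)),
]

for label, score, counts in hits:
    print(f"{label}: Query Match {query_match(score):.1f}%; "
          f"Best Local Similarity {best_local_similarity(*counts):.1f}%")
```

Under these assumptions the computed values agree with the 65.2%, 65.1% and 65.0% figures printed for RESULT 12 and RESULT 13.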



RESULT 14
AAY94250
ID AAY94250 standard; protein; 377 AA.
XX
AC AAY94250;
XX
DT 10-AUG-2000 (first entry)
XX
DE C. elegans yeast-like calcium binding protein.
XX
KW Calcium binding protein; cancer; inflammation; yeast-like CBP; CBP;
KW reproductive disorder; autoimmune disorder; developmental disorder;
KW seizure disorder; immune disorder; infection.
XX
OS Caenorhabditis elegans.
XX
PN WO200029580-A1.
XX
PD 25-MAY-2000.



XX 

PF 12-NOV-1999; 99WO-US27027 . 
XX 

PR 13-NOV-1998; 98US-0190965 . 
XX 

PA (INCY-) INCYTE PHARM INC. 
XX 

PI Tang YT, Guegler KJ, Corley NC, Gorgone GA; 
XX 

DR WPI; 2000-387793/33. 
XX 

PT Human hCBP protein, and the nucleic acid encoding it, useful for e.g. 

PT diagnosis, prevention and treatment of cancers, immune, developmental 

PT or reproductive disorders - 
XX 

PS Disclosure; Page 68-69; 72pp; English. 

XX 

CC The present sequence is the C. elegans yeast-like CBP. It
CC was used in a sequence alignment to identify human calcium binding
CC protein hCBP. The hCBP protein and the gene encoding it are
CC useful for the diagnosis and treatment of the following types of
CC disorder: cancers (such as adenocarcinomas), reproductive disorders
CC (such as infertility, ovulatory defects, endometriosis, disruptions of
CC the oestrus and menstrual cycles, polycystic ovary syndrome and ovarian
CC hyperstimulation), autoimmune disorders (such as benign prostatic
CC hyperplasia and prostatitis), developmental disorders (such as
CC Cushing's syndrome, muscular dystrophy and gonadal dysgenesis),
CC hereditary neuropathies, seizure disorders, immune disorders (such as
CC AIDS, allergies, anaemia, asthma, atherosclerosis, cholecystitis, Crohn's
CC disease, diabetes, Graves' disease, multiple sclerosis, psoriasis,
CC rheumatoid arthritis, scleroderma, Sjogren's syndrome and ulcerative
CC colitis), and viral, bacterial, fungal, parasitic, protozoal and
CC helminthic infections.

XX 

SQ Sequence 377 AA; 

Query Match 62.4%; Score 1063.5; DB 21; Length 377; 

Best Local Similarity 60.5%; Pred. No. 2.8e-88; 

Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps 3 

Qy 4 MP-LFSKSHKNPAEIVKILKDNLAILEK-------------QDKKTDKASEEVSKSLQAM 49
|| || ||||:||::|| h = i Ihl Ml Ml vl M h : =
Db 1 MPLLFGKSHKSPADWKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60

Qy 50 KEILCGTNEKEPPTE---AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106
| : | : MM IMMIhh: :| || | :|| |||| ||l!hll
Db 61 KSFIYGNDSAEPSSEHWQVAQLAQEVYNANILPMLIKMLPKFEFECKKDVGQIFNNLLR 120

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166
1 1 1 1 1 1 1 1 1 1 1 1 = I I M hMI I Ml 1 1 - 1 1 M Mh lllllhh I
Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226
M |h MhlMhMhl MM : = hlh MM I h M 1 = 1111 = 1
Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240

Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286
IIMMMhMMII MINI hlhlll MINI llhllllMllllhhl
Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335
:|| :|| :h I hi Ml I M I I I M I I M M I M h - I *
Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349
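The Geneseq entries in this listing use two-letter line codes (ID, AC, DT, DE, KW, OS, PN, PD, PF, PR, PA, PI, DR, PT, PS, CC, SQ) separated by XX lines. The following is a minimal sketch of how such a record could be grouped by line code for further processing. The function name `parse_flatfile_record` is hypothetical and not part of GenCore or of any Geneseq tool, and the sketch assumes a single space after the line code, as printed in this listing.

```python
from collections import defaultdict


def parse_flatfile_record(lines):
    """Group the text of each line under its leading two-letter code."""
    fields = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        # "XX" lines are separators and carry no data.
        if not line or line == "XX":
            continue
        code, _, text = line.partition(" ")
        # Only accept well-formed two-letter upper-case codes.
        if len(code) == 2 and code.isupper():
            fields[code].append(text.strip())
    return dict(fields)


# Lines taken from RESULT 14 above.
record = parse_flatfile_record([
    "ID AAY94250 standard; protein; 377 AA.",
    "XX",
    "AC AAY94250;",
    "XX",
    "PF 12-NOV-1999; 99WO-US27027.",
    "XX",
    "PR 13-NOV-1998; 98US-0190965.",
])
print(record["PR"])
```

Repeated codes (for example the long PR lists in RESULT 15) accumulate in order under the same key, so the priority filings remain a list rather than being overwritten.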



RESULT 15
AAG45273
ID AAG45273 standard; Protein; 343 AA.
XX
AC AAG45273;
XX
DT 18-OCT-2000 (first entry)
XX
DE Arabidopsis thaliana protein fragment SEQ ID NO: 56816.
XX
KW Protein identification; signal transduction pathway; metabolic pathway;
KW hybridisation assay; genetic mapping; gene expression control; promoter;
KW termination sequence.
XX
OS Arabidopsis thaliana.
XX
PN EP1033405-A2.
XX
PD 06-SEP-2000.
XX
PF 25-FEB-2000; 2000EP-0301439.
XX
PR 25-FEB-1999; 99US-0121825.
PR 05-MAR-1999; 99US-0123180.
PR 09-MAR-1999; 99US-0123548.
PR 23-MAR-1999; 99US-0125788.
PR 25-MAR-1999; 99US-0126264.
PR 29-MAR-1999; 99US-0126785.
PR 01-APR-1999; 99US-0127462.
PR 06-APR-1999; 99US-0128234.
PR 08-APR-1999; 99US-0128714.
PR 16-APR-1999; 99US-0129845.
PR 19-APR-1999; 99US-0130077.
PR 21-APR-1999; 99US-0130449.
PR 23-APR-1999; 99US-0130510.
PR 23-APR-1999; 99US-0130891.
PR 28-APR-1999; 99US-0131449.
PR 30-APR-1999; 99US-0132048.
PR 30-APR-1999; 99US-0132407.
PR 04-MAY-1999; 99US-0132484.
PR 05-MAY-1999; 99US-0132485.
PR 06-MAY-1999; 99US-0132486.
PR 06-MAY-1999; 99US-0132487.
PR 07-MAY-1999; 99US-0132863.
PR 11-MAY-1999; 99US-0134256.
PR 14-MAY-1999; 99US-0134218.
PR 14-MAY-1999; 99US-0134219.
PR 14-MAY-1999; 99US-0134221.
PR 14-MAY-1999; 99US-0134370.



PR 18-MAY-1999; 99US-0134768.
PR 19-MAY-1999; 99US-0134941.
PR 20-MAY-1999; 99US-0135124.
PR 21-MAY-1999; 99US-0135353.
PR 24-MAY-1999; 99US-0135629.
PR 25-MAY-1999; 99US-0136021.
PR 27-MAY-1999; 99US-0136392.
PR 28-MAY-1999; 99US-0136782.
PR 01-JUN-1999; 99US-0137222.
PR 03-JUN-1999; 99US-0137528.
PR 04-JUN-1999; 99US-0137502.
PR 07-JUN-1999; 99US-0137724.
PR 08-JUN-1999; 99US-0138094.
PR 10-JUN-1999; 99US-0138540.
PR 10-JUN-1999; 99US-0138847.
PR 14-JUN-1999; 99US-0139119.
PR 16-JUN-1999; 99US-0139452.
PR 16-JUN-1999; 99US-0139453.
PR 17-JUN-1999; 99US-0139492.
PR 18-JUN-1999; 99US-0139454.
PR 18-JUN-1999; 99US-0139455.
PR 18-JUN-1999; 99US-0139456.
PR 18-JUN-1999; 99US-0139457.
PR 18-JUN-1999; 99US-0139458.
PR 18-JUN-1999; 99US-0139459.
PR 18-JUN-1999; 99US-0139460.
PR 18-JUN-1999; 99US-0139461.
PR 18-JUN-1999; 99US-0139462.
PR 18-JUN-1999; 99US-0139463.
PR 18-JUN-1999; 99US-0139750.
PR 18-JUN-1999; 99US-0139763.
PR 21-JUN-1999; 99US-0139817.
PR 22-JUN-1999; 99US-013989S.
PR 23-JUN-1999; 99US-0140353.
PR 23-JUN-1999; 99US-0140354.
PR 24-JUN-1999; 99US-0140695.
PR 28-JUN-1999; 99US-0140823.
PR 29-JUN-1999; 99US-0140991.
PR 30-JUN-1999; 99US-0141287.
PR 01-JUL-1999; 99US-0141842.
PR 01-JUL-1999; 99US-0142154.
PR 02-JUL-1999; 99US-0142055.
PR 06-JUL-1999; 99US-0142390.
PR 08-JUL-1999; 99US-0142803.
PR 09-JUL-1999; 99US-0142920.
PR 12-JUL-1999; 99US-0142977.
PR 13-JUL-1999; 99US-0143542.
PR 14-JUL-1999; 99US-0143624.
PR 15-JUL-1999; 99US-0144005.
PR 16-JUL-1999; 99US-0144085.
PR 16-JUL-1999; 99US-0144086.
PR 19-JUL-1999; 99US-0144325.
PR 19-JUL-1999; 99US-0144331.
PR 19-JUL-1999; 99US-0144332.
PR 19-JUL-1999; 99US-0144333.
PR 19-JUL-1999; 99US-0144334.
PR 19-JUL-1999; 99US-0144335.



PR 20-JUL-1999; 99US-0144352.
PR 20-JUL-1999; 99US-0144632.
PR 20-JUL-1999; 99US-0144884.
PR 21-JUL-1999; 99US-0144814.
PR 21-JUL-1999; 99US-0145086.
PR 21-JUL-1999; 99US-0145088.
PR 22-JUL-1999; 99US-0145085.
PR 22-JUL-1999; 99US-0145087.
PR 22-JUL-1999; 99US-0145089.
PR 22-JUL-1999; 99US-0145192.
PR 23-JUL-1999; 99US-0145145.
PR 23-JUL-1999; 99US-0145218.
PR 23-JUL-1999; 99US-0145224.
PR 26-JUL-1999; 99US-0145276.
PR 27-JUL-1999; 99US-0145913.
PR 27-JUL-1999; 99US-0145918.
PR 27-JUL-1999; 99US-0145919.
PR 28-JUL-1999; 99US-0145951.
PR 02-AUG-1999; 99US-0146386.
PR 02-AUG-1999; 99US-0146388.
PR 02-AUG-1999; 99US-0146389.
PR 03-AUG-1999; 99US-0147038.
PR 04-AUG-1999; 99US-0147204.
PR 04-AUG-1999; 99US-0147302.
PR 05-AUG-1999; 99US-0147192.
PR 05-AUG-1999; 99US-0147260.
PR 06-AUG-1999; 99US-0147303.
PR 06-AUG-1999; 99US-0147416.
PR 09-AUG-1999; 99US-0147493.
PR 09-AUG-1999; 99US-0147935.
PR 10-AUG-1999; 99US-0148171.
PR 11-AUG-1999; 99US-0148319.
PR 12-AUG-1999; 99US-0148341.
PR 13-AUG-1999; 99US-0148565.
PR 13-AUG-1999; 99US-0148684.
PR 16-AUG-1999; 99US-0149368.
PR 17-AUG-1999; 99US-0149175.
PR 18-AUG-1999; 99US-0149426.
PR 20-AUG-1999; 99US-0149722.
PR 20-AUG-1999; 99US-0149723.
PR 20-AUG-1999; 99US-0149929.
PR 23-AUG-1999; 99US-0149902.
PR 23-AUG-1999; 99US-0149930.
PR 25-AUG-1999; 99US-0150566.
PR 26-AUG-1999; 99US-0150884.
PR 27-AUG-1999; 99US-0151065.
PR 27-AUG-1999; 99US-0151066.
PR 27-AUG-1999; 99US-0151080.
PR 30-AUG-1999; 99US-0151303.
PR 31-AUG-1999; 99US-0151438.
PR 01-SEP-1999; 99US-0151930.
PR 07-SEP-1999; 99US-0152363.
PR 10-SEP-1999; 99US-0153070.
PR 13-SEP-1999; 99US-0153758.
PR 15-SEP-1999; 99US-0154018.
PR 16-SEP-1999; 99US-0154039.
PR 20-SEP-1999; 99US-0154779.



PR 22-SEP-1999; 99US-0155139.
PR 23-SEP-1999; 99US- t c c a o a
PR 24-SEP-1999; 99US-0155659.
PR 28-SEP-1999; 99US-0156458.
PR 29-SEP-1999; 99US-0156596.
PR 04-OCT-1999; 99US-0157117.
PR 05-OCT-1999; 99US-0157753.
PR 06-OCT-1999; 99US-0157865.
PR 07-OCT-1999; 99US-0158029.
PR 08-OCT-1999; 99US-0158232.
PR 12-OCT-1999; 99US-0158369.
PR 13-OCT-1999; 99US-0159293.
PR 13-OCT-1999; 99US-0159294.
PR 13-OCT-1999; 99US-0159295.
PR 14-OCT-1999; 99US-0159329.
PR 14-OCT-1999; 99US-0159330.
PR 14-OCT-1999; 99US-0159331.
PR 14-OCT-1999; 99US-0159637.
PR 14-OCT-1999; 99US-0159638.
PR 18-OCT-1999; 99US-0159584.
PR 21-OCT-1999; 99US-0160741.
PR 21-OCT-1999; 99US-0160767.
PR 21-OCT-1999; 99US-01607o8.
PR 21-OCT-1999; 99US-0160770.
PR 21-OCT-1999; 99US-0160814.
PR 21-OCT-1999; 99US-0160815.
PR 22-OCT-1999; 99US-0160980.
PR 22-OCT-1999; 99US-0160981.
PR 22-OCT-1999; 99US-0160989.
PR 25-OCT-1999; 99US-0161404.
PR 25-OCT-1999; 99US-0161405.
PR 25-OCT-1999; 99US-0161406.
PR 26-OCT-1999; 99US-0161359.
PR 26-OCT-1999; 99US-0161360.
PR 26-OCT-1999; 99US-0161361.
PR 28-OCT-1999; 99US-0161920.
PR 28-OCT-1999; 99US-0161992.
PR 28-OCT-1999; 99US-0161993.
PR 29-OCT-1999; 99US-0162142.



Query Match 42.0%; Score 716.5; DB 21; Length 343;
Best Local Similarity 42.9%; Pred. No. 9.3e-57;
Matches 144; Conservative 78; Mismatches 105; Indels 9; Gaps 3;

Qy 6 LFSKSHKNPAEIVKILKDNLAILEK-------QDKKTDKASEEVSKSLQAMKEILCGTNE 58
II : ! | : | | : : | | :: - I - I = I II = = ' I I I I . = I
Db 4 LFKSKPRTPADIVRQTRDLLLYADRSNSFPDLRESKREEKMVELSKSIRDLKLILYGNSE 63

Qy 59 KEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEY 118
|| || IN || : : I h I ■■ I : I I 11= 1= 1 = 1= =1 : l
Db 64 AEPVAEACAQLTQEFFKADTLRRLLTSLPNLNLEARKDATQWANLQRQQVNSRLIAADY 123

Qy 119 ISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFD 178
^ : :: :: |: |:| =11 I I I I I I I 1= = I I = I I ! " l' = l . II
Db 124 LESNIDLMDFLVDGFENTDMALHYGTMFRECIRHQIVAKYVLDSEHVKKFFYYIQLPNFD 183

Qy 179 IASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDY-EKLLQSENYVTKRQSLKLLGELIL 237
Il-ll lllhllllll 11-11 ; I I I II 111 = 1 1 1 = I = 1 1 = = 1 1 1 1 = = = I
Db 184 IAADAAATFKELLTRHKSTVAEFLIKNKDWFFADYNSKLLESTNYITRRQAIKLLGDILL 243

Qy 238 DRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKN 297
II I |:| | I hi = I h = = I I I II : I II I I I I I I I : I I I : : I I I I ' I n ,
Db 244 DRSNSAVMTKYVSSMDNLRILMNLLRESSKTIQIEAFHVFKLFVANQNKPSDIANILVAN 303

Qy 298 QPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333
: | | : h ■ ■■■■ : I l : I =1 - = = I : I I
Db 304 RNKLLRLLADIKPDK-EDERFDADKAQWREIANLK 338



Search completed: January 7, 2004, 16:47:07
Job time : 55 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: January 7, 2004, 16:44:17 ; Search time 21 Seconds 

(without alignments) 
678.989 Million cell updates/sec 

Title: US-10-088-872-2

Perfect score: 1704

Sequence: 1 MKKMPLFSKSHKNPAEIVKI FADEKNYLIKQIRDLKKTAP 337

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5 

Searched: 328717 seqs, 42310858 residues 



Total number of hits satisfying chosen parameters: 328717 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
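The throughput figure in the run header appears to follow from the other header values: a 337-residue query compared against 42310858 database residues in 21 seconds. The sketch below reproduces that arithmetic. The helper name `cell_updates_per_sec` is hypothetical, and the formula (query length times database residues, divided by search time) is an assumption that happens to match the printed figure, not a documented GenCore definition.

```python
def cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second (assumed definition)."""
    return query_len * db_residues / seconds


# Values from the run header above: query length 337,
# 42310858 residues searched, search time 21 seconds.
rate = cell_updates_per_sec(337, 42_310_858, 21)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

With these inputs the formatted value is 678.989, which agrees with the header line.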



Database : Issued_Patents_AA:*
1: /cgn2_6/ptodata/1/iaa/5A_COMB.pep:*
2: /cgn2_6/ptodata/1/iaa/5B_COMB.pep:*
3: /cgn2_6/ptodata/1/iaa/6A_COMB.pep:*
4: /cgn2_6/ptodata/1/iaa/6B_COMB.pep:*
5: /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:*
6: /cgn2_6/ptodata/1/iaa/backfiles1.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
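The run header names a Smith-Waterman ("sw") model with a gap-open penalty of 10.0 and a gap-extension penalty of 0.5. As a rough illustration of how a local-alignment score of this kind can be computed, the sketch below implements the standard affine-gap recurrences (Gotoh's formulation). It is not GenCore's implementation. The function names, the toy scoring function (+5 for a match, -4 for a mismatch, standing in for BLOSUM62) and the convention that a gap of length k costs gap_open + k * gap_ext are all assumptions made for the example.

```python
def toy_score(x, y):
    """Toy substitution score; a stand-in for a real matrix such as BLOSUM62."""
    return 5.0 if x == y else -4.0


def smith_waterman_affine(a, b, score=toy_score, gap_open=10.0, gap_ext=0.5):
    """Return the best local alignment score of a and b with affine gaps."""
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0.0] * (m + 1)      # best score ending at (i-1, j)
    f_prev = [neg_inf] * (m + 1)  # best score ending in a gap in b at (i-1, j)
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # best score ending in a gap in a at (i, j)
        for j in range(1, m + 1):
            # Open a new gap from H, or extend an existing one.
            e = max(h_cur[j - 1] - gap_open - gap_ext, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open - gap_ext, f_prev[j] - gap_ext)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Self-alignment: every residue matches, so the score is 4 * 5.
print(smith_waterman_affine("ACDE", "ACDE"))
# One residue deleted: 9 matches minus one gap of length 1 (10.0 + 0.5).
print(smith_waterman_affine("MKKMPLFSKS", "MKKMPFSKS"))
```

Under a real substitution matrix, the score of a sequence aligned against itself plays the role of the "Perfect score" line in the header; the toy scores here are not comparable to the values in this report.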



SUMMARIES 

                 %
Result         Query
   No.  Score  Match  Length  DB  ID                   Description
     1   1704  100.0     337   3  US-09-190-965-1      Sequence 1, Appli
     2   1704  100.0     337   4  US-09-470-253-1      Sequence 1, Appli
     3   1376   80.8     341   3  US-09-190-965-3      Sequence 3, Appli
     4   1376   80.8     341   4  US-09-470-253-3      Sequence 3, Appli
     5   1109   65.1     339   3  US-09-190-965-4      Sequence 4, Appli
     6   1109   65.1     339   4  US-09-470-253-4      Sequence 4, Appli
     7 1063.5   62.4     377   3  US-09-190-965-5      Sequence 5, Appli
     8 1063.5   62.4     377   4  US-09-470-253-5      Sequence 5, Appli
     9  128.5    7.5    3878   4  US-09-914-259-11     Sequence 11, Appl
    10  113.5    6.7    1279   4  US-09-724-517-2      Sequence 2, Appli
    11  113.5    6.7    1279   4  US-09-641-807A-2     Sequence 2, Appli
    12  113.5    6.7    1279   4  US-09-723-096-2      Sequence 2, Appli
    13    113    6.6    2184   4  US-09-417-485D-6     Sequence 6, Appli
    14    105    6.2     586   2  US-08-630-822A-70    Sequence 70, Appl
    15    105    6.2     586   2  US-09-005-069-70     Sequence 70, Appl
    16    105    6.2     586   4  US-09-171-156A-30    Sequence 30, Appl
    17    105    6.2     586   4  US-09-004-730A-30    Sequence 30, Appl
    18    105    6.2     586   4  US-08-981-799A-30    Sequence 30, Appl
    19  103.5    6.1     245   4  US-09-399-913-4      Sequence 4, Appli
    20  103.5    6.1     245   4  US-09-298-731-4      Sequence 4, Appli
    21    103    6.0     387   4  US-09-328-352-5367   Sequence 5367, Ap
    22    103    6.0    2662   4  US-09-595-684B-31    Sequence 31, Appl
    23  102.5    6.0     975   4  US-09-914-259-19     Sequence 19, Appl
    24  102.5    6.0    1098   3  US-08-923-992A-8     Sequence 8, Appli
    25  102.5    6.0    1164   3  US-08-923-992A-10    Sequence 10, Appl
    26  102.5    6.0    1388   4  US-09-572-191-2      Sequence 2, Appli
    27  102.5    6.0    1388   4  US-09-723-262-2      Sequence 2, Appli
    28  102.5    6.0    1388   4  US-09-723-219-2      Sequence 2, Appli
    29    102    6.0     474   3  US-08-387-117-6      Sequence 6, Appli
    30    102    6.0    1128   3  US-08-923-992A-6     Sequence 6, Appli
    31    101    5.9    1147   3  US-08-470-260-5      Sequence 5, Appli
    32    101    5.9    1147   3  US-08-471-491-5      Sequence 5, Appli
    33    101    5.9    1147   3  US-08-466-662-5      Sequence 5, Appli
    34    101    5.9    3289   2  US-08-477-451-2      Sequence 2, Appli
    35   99.5    5.8    1164   3  US-08-923-992A-2     Sequence 2, Appli
    36     99    5.8     323   4  US-09-134-001C-3133  Sequence 3133, Ap
    37     98    5.8    1048   3  US-09-356-952-5      Sequence 5, Appli
    38     97    5.7    2482   1  US-08-328-254-6      Sequence 6, Appli
    39     97    5.7    3248   1  US-08-353-700-1      Sequence 1, Appli
    40     97    5.7    3248   5  PCT-US95-16216-1     Sequence 1, Appli
    41     96    5.6    1183   4  US-09-107-532A-6680  Sequence 6680, Ap
    42   95.5    5.6     967   4  US-09-914-259-21     Sequence 21, Appl
    43   95.5    5.6    1027   4  US-09-914-259-27     Sequence 27, Appl
    44     95    5.6     564   4  US-09-198-452A-601   Sequence 601, App
    45     95    5.6     956   4  US-09-914-259-17     Sequence 17, Appl
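The "% Query Match" column in the summaries is consistent with each hit's score expressed as a percentage of the perfect score of 1704 given in the run header. The short sketch below checks that relationship against a few rows. The function name `query_match_percent` is hypothetical, and rounding to one decimal place is an assumption based on how the values are printed.

```python
PERFECT_SCORE = 1704  # "Perfect score" line of the run header


def query_match_percent(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect, 1)


# (score, printed % Query Match) pairs taken from this report.
rows = [(1704, 100.0), (1376, 80.8), (1109, 65.1), (1063.5, 62.4), (128.5, 7.5)]
for score, printed in rows:
    print(score, query_match_percent(score), printed)
```

The same relationship holds for the Geneseq hit with score 716.5, which is listed with a query match of 42.0%.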



ALIGNMENTS 



RESULT 1 
US-09-190-965-1 

; Sequence 1, Application US/09190965
; Patent No. 6071721
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/190,965
; CURRENT FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 1
; LENGTH: 337
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE: -
; OTHER INFORMATION: 3734805
US-09-190-965-1

Query Match 100.0%; Score 1704; DB 3; Length 337;
Best Local Similarity 100.0%; Pred. No. 1.6e-161;
Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60

Qy  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120

Qy 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||
Db 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
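Each alignment in this report is preceded by statistics lines of the form "Name value; Name value;". A minimal sketch of reading those lines into a dictionary is shown below. The function name `parse_stats` is hypothetical, and the sketch assumes that each value is the last whitespace-separated token of its field, which holds for the lines printed here.

```python
def parse_stats(line):
    """Split a 'Name value; Name value;' statistics line into a dict of strings."""
    fields = {}
    for part in line.split(";"):
        part = part.strip()
        if not part:
            continue
        # The value is the final token; everything before it is the name.
        name, _, value = part.rpartition(" ")
        fields[name.strip()] = value
    return fields


# Statistics lines from RESULT 1 above.
stats = parse_stats("Query Match 100.0%; Score 1704; DB 3; Length 337;")
stats.update(parse_stats("Best Local Similarity 100.0%; Pred. No. 1.6e-161;"))
print(stats["Score"], stats["Pred. No."])
```

Values are kept as strings so that the caller can decide how to convert them, for example `float(stats["Pred. No."])` or `float(stats["Query Match"].rstrip("%"))`.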



RESULT 2 
US-09-470-253-1 

; Sequence 1, Application US/09470253
; Patent No. 6365371
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/470,253
; CURRENT FILING DATE: 1999-12-22
; PRIOR APPLICATION NUMBER: 09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 1
; LENGTH: 337
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE: -
; OTHER INFORMATION: 3734805
US-09-470-253-1

Query Match 100.0%; Score 1704; DB 4; Length 337;
Best Local Similarity 100.0%; Pred. No. 1.6e-161;
Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60

Qy  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120

Qy 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||
Db 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337



RESULT 3 
US-09-190-965-3 

; Sequence 3, Application US/09190965
; Patent No. 6071721
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/190,965
; CURRENT FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 3
; LENGTH: 341
; TYPE: PRT
; ORGANISM: Mus sp.
; FEATURE: -
; OTHER INFORMATION: g262934
US-09-190-965-3

Query Match 80.8%; Score 1376; DB 3; Length 341; 

Best Local Similarity 80.7%; Pred. No. 7.3e-129; 

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps 2; 

Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59

II I I I i I : I I : I I I I I : : : I : I I I I III : I I : I I I I I : I I I I I I I I I I II 

Db 1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60 

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

M I I I I I I I I I I I I : I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I 
Db 61 EPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120 

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179 

: I I I I I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I : I II I I I : I I I : I I I I I 
Db 121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180 

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239 

I I I I I I M I I I I I I I I : I I : I I I I : I I I : I I I I I I I I I I I I I I I I I I I I I I : I I I 
Db 181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

III I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I : I : I I I I I :: I I I I i I 

Db 241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQT 300 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336 

I I I I I I I II : I I : I I I I III I I : I I I I : I I : I 
Db 301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA 337 
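The "Best Local Similarity" figures in this report are consistent with the number of identical positions divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels together). The sketch below checks this against three results. The function name `best_local_similarity` is hypothetical, and the formula is an inference from the printed numbers rather than a documented GenCore definition.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# RESULT 3 above: Matches 272; Conservative 32; Mismatches 29; Indels 4.
print(best_local_similarity(272, 32, 29, 4))
# Drosophila hit: Matches 217; Conservative 59; Mismatches 54; Indels 4.
print(best_local_similarity(217, 59, 54, 4))
# C. elegans hit: Matches 211; Conservative 53; Mismatches 68; Indels 17.
print(best_local_similarity(211, 53, 68, 17))
```

For RESULT 3 the column total is 272 + 32 + 29 + 4 = 337, and 272 / 337 rounds to the 80.7% printed in the statistics line.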



RESULT 4 
US-09-470-253-3 

; Sequence 3, Application US/09470253
; Patent No. 6365371
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/470,253
; CURRENT FILING DATE: 1999-12-22
; PRIOR APPLICATION NUMBER: 09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 3
; LENGTH: 341
; TYPE: PRT
; ORGANISM: Mus sp.
; FEATURE: -
; OTHER INFORMATION: g262934
US-09-470-253-3

Query Match 80.8%; Score 1376; DB 4; Length 341; 

Best Local Similarity 80.7%; Pred. No. 7.3e-129; 

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps 2; 

Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
II I I I I I : I I : I I I I I : : : ! : I I I I | | | : I I : I I I I I : I MINI Mill
Db 1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
M I I I I I I I I M I I : I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I | | : I I I I I I
Db 61 EPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
I I I I I I I I I I : I : I I I II I I II II I I I I I I I I | | | : | | | I I I : I I I : i I I II
Db 121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
I I I i I II I I I I I I I I I : I I : I I I I : I I I : I I I I I I I I I I I I f I I I I I I I I I : I I I
Db 181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
II I I I I I I I I I ! I I I I I I I I I I I I I I I i I I I I I I M I I I I : I : I I I I I :: I I I I I I
Db 241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQT 300

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
I I I I I I I I I : I I : I I I I III I I : I I I I : I I : I
Db 301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA 337



RESULT 5 
US-09-190-965-4 

; Sequence 4, Application US/09190965
; Patent No. 6071721
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/190,965
; CURRENT FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 4
; LENGTH: 339
; TYPE: PRT
; ORGANISM: Drosophila melanogaster
; FEATURE: -
; OTHER INFORMATION: g1794137
US-09-190-965-4



Query Match 65.1%; Score 1109; DB 3; Length 339; 

Best Local Similarity 65.0%; Pred. No. 2.8e-102; 

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3; 

Qy 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63 

i I I I I I I : I I : I I I I : : II I : I : I I | : | | | : | : : | : | | : : : | | | 
Db 1 MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLHGSSDAEPPA 60

Qy 64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122 

: I I I I : I I I I : I I I : II : I I I I I I I I I I I I I : I I I I I I I I I I I II I i 
Db 61 DYWAQLSQELYNSNLLLLLIQNLHRIDFEGKKHVALIFNNLLRRQIGTRSPTVEYICTK 120 

Qy 123 PHILFMLLKGYE--APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

11111:111 hill I I I I I I I : I I I II : I I :: I I I : I I I : I I I I I I 
Db 121 PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 239 

I I I I : I I I : I I I I I I : I I : I I : III I : I : : I I I I I I I I : I I I I I I I I I I : I I I 
Db 181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 240 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

Ml : I I : I I I : I I I I I I I I I : I : : t I I I I I I I I I I I I I I I I : I : I : I I : : I I I : I I 
Db 241 HNFTVMTRYISEPENLKLMMNMLKEKSRNIQFEAFHVFKVFVANPNKPKPILDILLRNQT 300 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333 

I I : : I I : : I : I : : I I I I I I I I I I II I : : I I 
Db 301 KLVDFLTNFHTDRSEDEQFNDEKAYLIKQIKELK 334 



RESULT 6 
US-09-470-253-4 

; Sequence 4, Application US/09470253
; Patent No. 6365371
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/470,253
; CURRENT FILING DATE: 1999-12-22
; PRIOR APPLICATION NUMBER: 09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 4
; LENGTH: 339
; TYPE: PRT
; ORGANISM: Drosophila melanogaster
; FEATURE: -
; OTHER INFORMATION: g1794137
US-09-470-253-4

Query Match 65.1%; Score 1109; DB 4; Length 339; 

Best Local Similarity 65.0%; Pred. No. 2.8e-102;

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3; 



Qy 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63 

I I I I II I : I j : I I ||: : II I : I : I I I : I I I : I : : I : I I : : : III 
Db 1 MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLHGSSDAEPPA 60

Qy 64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122 

: I I I I : I I I I : I II: II : I I I I I I I I I I I II : I I I I I II I I I I I I I I 
Db 61 DYWAQLSQELYNSNLLLLLIQNLHRIDFEGKKHVALIFNNLLRRQIGTRSPTVEYICTK 120

Qy 123 PHILFMLLKGYE--APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

I I I I I : I I I hill I I I I I I I : I I I I I : I I :: I I I : I I I : I I I I I I 
Db 121 PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 239 

I I I I : I I I : I I i I I I : I I : I I : III I : I : : I I I I I I I I : I I I I I I I I I I : I I I 
Db 181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 240 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

I I I : M : I I I : I I I I I I I II : I : : I I I I I I I I I I I I I I I I I : I : | : | I : : I I I : I I 
Db 241 HNFTVMTRYISEPENLKLMMNMLKEKSRNIQFEAFHVFKVFVANPNKPKPILDILLRNQT 300 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333 

I I : : I I : : I : I : : I I I I Ml I I I I I I : : I I 
Db 301 KLVDFLTNFHTDRSEDEQFNDEKAYLIKQIKELK 334



RESULT 7 
US-09-190-965-5 

; Sequence 5, Application US/09190965
; Patent No. 6071721
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/190,965
; CURRENT FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 5
; LENGTH: 377
; TYPE: PRT
; ORGANISM: Caenorhabditis elegans
; FEATURE: -
; OTHER INFORMATION: g1255838
US-09-190-965-5

Query Match 62.4%; Score 1063.5; DB 3; Length 377; 

Best Local Similarity 60.5%; Pred. No. 1.1e-97;

Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps 3; 



Qy 4 MP-LFSKSHKNPAEIVKILKDNLAILEK-------------QDKKTDKASEEVSKSLQAM 49

I I I I I I I I : I I : : i I I : : I I I : I I I I I I I : I I I I : : : 

Db 1 MPLLFGKSHKSPADWKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60 

Qy 50 KEILCGTNEKEPPTE---AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106

I : I : I I : I I I I I I I I : I : : : I II I : I I I I I I I I I I I : I I 

Db 61 KSFIYGNDSAEPSSEHWQVAQLAQEVYNANILPMLIKMLPKFEFECKKDVGQIFNNLLR 120 

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166 

II I I I II I I I I I : I I II I : : I I ! Ill I 1 : I I I I ill: I I I I I I : I : I 

Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226

II II: I I I : I I I I : I I I : I I I II : : I : I I : I I I I I I : II I : I I I I : I 
Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240

Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286

Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335

: I I : I I : I : I I : I I I I ! : I I I I I I I III I I I I I I : : : I : 
Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349



RESULT 8 
US-09-470-253-5 

; Sequence 5, Application US/09470253 

; Patent No. 6365371 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J. 

; APPLICANT: Corley, Neil C. 

; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
; FILE REFERENCE: PF-0635 US 

; CURRENT APPLICATION NUMBER: US/09/470,253 

; CURRENT FILING DATE: 1999-12-22 

; PRIOR APPLICATION NUMBER: 09/190,965 

; PRIOR FILING DATE: 1998-11-13 

; NUMBER OF SEQ ID NOS: 5

; SOFTWARE: PERL Program
; SEQ ID NO 5

; LENGTH: 377

; TYPE: PRT

; ORGANISM: Caenorhabditis elegans
; FEATURE: -

; OTHER INFORMATION: g1255838
US-09-470-253-5 

Query Match 62.4%; Score 1063.5; DB 4; Length 377; 

Best Local Similarity 60.5%; Pred. No. 1.1e-97;

Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps 3; 

Qy 4 MP-LFSKSHKNPAEIVKILKDNLAILEK-------------QDKKTDKASEEVSKSLQAM 49
ii i • i i • • i i i • • i i i • i iii mi . ii ii • • 
Db 1 MPLLFGKSHKSPADWKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60

Qy 50 KEILCGTNEKEPPTE---AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106
i ■ i - ii • i i i i i i i i • i • • -i ii i -ii i i i i i i i i i • i i 
Db 61 KSFIYGNDSAEPSSEHWQVAQLAQEVYNANILPMLIKMLPKFEFECKKDVGQIFNNLLR 120

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166
I I I I I I I I I I I I : I I II I : : I I I III I I : I I I I III: I I I I II : I : I 
Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226
II II: I I I : I II I : I I I : I I I I I : : I : I I : I I I I I I : II I : I I I I : I 
Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240

Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286
I I I I I I I I I : I I I I I I I I I I I I I : I I : I I I I I I I I I I I I : I I I I I I I I I I I : I : I 
Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335
: I I : I I : I : I I : I I I I I MINIM Ml I I I I I I : : M : 
Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349

RESULT 9 

US-09-914-259-11 

Sequence 11, Application US/09914259 
Patent No. 6495336 
GENERAL INFORMATION: 
APPLICANT: Makowski, Lee 
APPLICANT: Hyman, Paul 
APPLICANT: Williams, Mark 

TITLE OF INVENTION: STAGED ASSEMBLY OF NANOSTRUCTURES 
FILE REFERENCE: 8471-010-999 
CURRENT APPLICATION NUMBER: US/09/914,259
CURRENT FILING DATE: 2000-11-21
NUMBER OF SEQ ID NOS: 180

SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 11
LENGTH: 3878
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-914-259-11 

Query Match 7.5%; Score 128.5; DB 4; Length 3878; 

Best Local Similarity 20.1%; Pred. No. 0.0037; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

:: I I I I I I M : I I M : : : I : I I : M I 
Db 664 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ-----------FEKDNLITKQNQLILE 710

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI----------GTRSPTVEYISAHPHI 125

: : : f I I I : : : : : | | I I : : : I II I : : : 

Db 711 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 766

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

I Ml | : : | : | | | 

Db 767 LEKQMKEKE----------------------------NDLQEKFAQLEAEN-SILKDEKK 797

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTI FEDYEKLLQSENYVTKRQSLKLLGELI L 237 

I M M I : : : I : : : : I : : M : M II I M : M I : 

Db 798 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 857 

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264 

I : I : I I II : I : M I 

Db 858 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 917 

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320

: : : : I I I I I M : I : : I : MM I : : I : : : : M 
Db 918 NPTTVKMKSSVFDEDKTFVA ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 974 

Qy 321 EKNYLIKQIRDLKK 334 

I : M : : : : I I : 
Db 975 SEQLKQKHGEISFLNEEVKSLKQ 997



RESULT 10 
US-09-724-517-2 

; Sequence 2, Application US/09724517 

; Patent No. 6379941 

; GENERAL INFORMATION: 

; APPLICANT: Beraud, Christophe 

; APPLICANT: Freedman, Richard 

; TITLE OF INVENTION: Novel motor proteins and methods for
; TITLE OF INVENTION: their use

FILE REFERENCE: 1031 
; CURRENT APPLICATION NUMBER: US/09/724,517 
; CURRENT FILING DATE: 2000-11-27 
; PRIOR APPLICATION NUMBER: US/09/641,807 

PRIOR FILING DATE: 2000-08-17 
; NUMBER OF SEQ ID NOS : 4 

SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 2 

; LENGTH: 1279
; TYPE: PRT
; ORGANISM: Human
; FEATURE:

; NAME/KEY: VARIANT
; LOCATION: (409)...(436)

; OTHER INFORMATION: Xaa = any amino acid
US-09-724-517-2 



Query Match 6.7%; Score 113.5; DB 4; Length 1279; 

Best Local Similarity 19.3%; Pred. No. 0.024; 

Matches 87; Conservative 61; Mismatches 137; Indels 165; Gaps 14; 

Qy 23 DNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTL 82 

I : I I : : I I I : I : I I : : I : : : I III | | 

Db 794 DHLQKLDEQKKWLDEEVEKVLNQRQELEELEADLKKREAIVSKKEALLQE--KSHLENKK 851

Qy 83 IADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISA 121

: I : : : I I : : I : I : : : : : : : : | 
Db 852 LRSSQALNTDSLKISTRL--NLLEQELSEKNVQLQTSTAEEKTKISEQVEVLQKEKDQLQ 909

Qy 122 HPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKI ILFS 162 

I : I I I : I I I : I : I I : : : I 

Db 910 KRRHDVDEKLKNGRVLSPEEEHVLFQLEEGIEALEAAIE YRNESIQNRQKSLRASFH 966 

Qy 163 NQFRDFFKYVE LSTFDIASDAFATFKDLLT RHKVLVAD 200 

II : I I I : I : I I : : : III 

Db 967 NLSRGEANVLEKLACLSPVEIRTILFRYFNKVVNLREAERKQQLYNEEMKMKVLERDNMV 1026 



Qy 201 FLEQNYDTI FEDYEKLLQS 219 

! I I : : I : I I : I : 

Db 1027 RELESALDHLKLQCDRRLTLQQKEHEQKMQLLLHHFKEQDGEGIMETFKTYEDKIQQLEK 1086 

Qy 220 ENYVTKRQS-- LKLLGELILDRHNFAIM TKYISK 251 

: I I : I : I : I I I I I I : I : 

Db 1087 DLYFYKKTSRDHKKKLKELVGEAI--RRQLAPSEYQEAGDGVLKPEGGGMLSEELKWASR 1144

Qy 252 PENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSF 308

II : : I I I : : : I I : I It : I : : I : I II 

Db 1145 PESMKLSG REREMDSS ASSLRTQPNPQKLWEDI PELPPIHSSLAPP 1190 

Qy 309 QKERTDDEQFADEKNYLIKQIR 330 

I I I I I I : I I I : 

Db 1191 SGHMLGNENKTETDDNQFTKSHSRLSSQIQ 1220 



RESULT 11 
US-09-641-807A-2 

Sequence 2, Application US/09641807A 
Patent No. 6440731 
GENERAL INFORMATION: 
APPLICANT: Beraud, Christophe 
APPLICANT: Freedman, Richard 

TITLE OF INVENTION: Novel motor proteins and methods for
TITLE OF INVENTION: their use
FILE REFERENCE: 1031

CURRENT APPLICATION NUMBER: US/09/641,807A
CURRENT FILING DATE: 2000-08-17
NUMBER OF SEQ ID NOS: 4

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 2 
LENGTH: 1279 
TYPE: PRT 
ORGANISM: Human 
FEATURE:

NAME/KEY: VARIANT
LOCATION: (409)...(446)

OTHER INFORMATION: Xaa = any amino acid 
US-09-641-807A-2 

Query Match 6.7%; Score 113.5; DB 4; Length 1279; 

Best Local Similarity 19.3%; Pred. No. 0.024; 

Matches 87; Conservative 61; Mismatches 137; Indels 165; Gaps 14; 

Qy 23 DNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTL 82 

I : I I : : I I I : I : I I : : I : : : I III I ! 

Db 794 DHLQKLDEQKKWLDEEVEKVLNQRQELEELEADLKKREAIVSKKEALLQE--KSHLENKK 851

Qy 83 IADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISA 121 

: | : : : | | : : | : | : : : : : : : : I 
Db 852 LRSSQALNTDSLKISTRL--NLLEQELSEKNVQLQTSTAEEKTKISEQVEVLQKEKDQLQ 909 

Qy 122 HPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKI ILFS 162 

I : I i I : I I I : I : I I : : : I 

Db 910 KRRHDVDEKLKNGRVLSPEEEHVLFQLEEGIEALEAAIE YRNESIQNRQKSLRASFH 966 

Qy 163 NQFRDFFKYVE LSTFDIASDAFATFKDLLT RHKVLVAD 200

II : I | | : | : | | : : : I I I I 

Db 967 NLSRGEANVLEKLACLSPVEIRTILFRYFNKVVNLREAERKQQLYNEEMKMKVLERDNMV 1026

Qy 201 FLEQNYDTI FEDYEKLLQS 219

I I I : : I : I I : I : 

Db 1027 RELESALDHLKLQCDRRLTLQQKEHEQKMQLLLHHFKEQDGEGIMETFKTYEDKIQQLEK 1086

Qy 220 ENYVTKRQS LKLLGELILDRHNFAIM TKYISK 251

: I I : I : I : I I I I I I : I : 

Db 1087 DLYFYKKTSRDHKKKLKELVGEAI--RRQLAPSEYQEAGDGVLKPEGGGMLSEELKWASR 1144

Qy 252 PENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSF 308 

I I : : I I I : : : II : I I I : I : : I : I II 

Db 1145 PESMKLSG REREMDSS ASSLRTQPNPQKLWEDIPELPPIHSSLAPP 1190 

Qy 309 QKERTDDEQFADEKNYLIKQIR 330 

I I I I I I : I II: 

Db 1191 SGHMLGNENKTETDDNQFTKSHSRLSSQIQ 1220 



RESULT 12 
US-09-723-096-2 

; Sequence 2, Application US/09723096 

; Patent No. 6448026 

; GENERAL INFORMATION:

; APPLICANT: Beraud, Christophe

; APPLICANT: Freedman, Richard 

; TITLE OF INVENTION: Novel motor proteins and methods for
; TITLE OF INVENTION: their use
; FILE REFERENCE: 1031

; CURRENT APPLICATION NUMBER: US/09/723,096

; CURRENT FILING DATE: 2000-11-27

; PRIOR APPLICATION NUMBER: US/09/641,807

PRIOR FILING DATE: 2000-08-17 
; NUMBER OF SEQ ID NOS : 4 

SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 2 

; LENGTH: 1279

; TYPE: PRT

; ORGANISM: Human

; FEATURE:
; NAME/KEY: VARIANT

; LOCATION: (409)...(436)
; OTHER INFORMATION: Xaa = any amino acid
US-09-723-096-2 



Query Match 6.7%; Score 113.5; DB 4; Length 1279; 

Best Local Similarity 19.3%; Pred. No. 0.024; 

Matches 87; Conservative 61; Mismatches 137; Indels 165; Gaps 14; 

Qy 23 DNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTL 82 

1:1 I : : I I I : hi I : : I : : : I III II 

Db 794 DHLQKLDEQKKWLDEEVEKVLNQRQELEELEADLKKREAIVSKKEALLQE--KSHLENKK 851

Qy 83 IADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISA 121

: I : : : I I : : I : I : : : : : : : : I 
Db 852 LRSSQALNTDSLKISTRL--NLLEQELSEKNVQLQTSTAEEKTKISEQVEVLQKEKDQLQ 909

Qy 122 HPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFS 162 

I : I I I : I I I : I : I I : : : I 

Db 910 KRRHDVDEKLKNGRVLSPEEEHVLFQLEEGIEALEAAIE YRNESIQNRQKSLRASFH 966 

Qy 163 NQFRDFFKYVE LSTFDIASDAFATFKDLLT RHKVLVAD 200 

II : I I I : | : | | :: : III 

Db 967 NLSRGEANVLEKLACLSPVEIRTILFRYFNKVVNLREAERKQQLYNEEMKMKVLERDNMV 1026



Qy 201 FLEQNYDTI FEDYEKLLQS 219 

I II: : I : I I : I : 

Db 1027 RELESALDHLKLQCDRRLTLQQKEHEQKMQLLLHHFKEQDGEGIMETFKTYEDKIQQLEK 1086 

Qy 220 ENYVTKRQS LKLLGELILDRHNFAIM TKYISK 251 

: I I : I : I : I I I I I I : I : 

Db 1087 DLYFYKKTSRDHKKKLKELVGEAI--RRQLAPSEYQEAGDGVLKPEGGGMLSEELKWASR 1144

Qy 252 PENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSF 308 

M : : I I I : : : | | : | | | : | : : | : | | | 

Db 1145 PESMKLSG REREMDSS ASSLRTQPNPQKLWEDIPELPPIHSSLAPP 1190 

Qy 309 QKERTDDEQFADEKNYLIKQIR 330 

I I I I I I : I II: 

Db 1191 SGHMLGNENKTETDDNQFTKSHSRLSSQIQ 1220 



RESULT 13 
US-09-417-485D-6 

; Sequence 6, Application US/09417485D 

; Patent No. 6541202 

; GENERAL INFORMATION: 

; APPLICANT: Long, David M. 

; APPLICANT: Metz, Anneke M. 

; APPLICANT: Love, Ruschelle A. 

TITLE OF INVENTION: Telomerase Reverse Transcriptase (TERT) Genes 

; FILE REFERENCE: 47714-5009-US
; CURRENT APPLICATION NUMBER: US/09/417,485D

; CURRENT FILING DATE: 2002-06-14
; NUMBER OF SEQ ID NOS: 49
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 6 
; LENGTH: 2184 

TYPE: PRT 
; ORGANISM: Plasmodium falciparum 
; FEATURE:
; NAME/KEY: unsure

; LOCATION: (330)..(335)

; OTHER INFORMATION: Xaa at position 330 = Leu or Ile;
; OTHER INFORMATION: Xaa at position 335 = Asp or Gly.
US-09-417-485D-6 

Query Match 6.6%; Score 113; DB 4; Length 2184; 

Best Local Similarity 21.9%; Pred. No. 0.057; 

Matches 77; Conservative 58; Mismatches 140; Indels 76; Gaps 17;

Qy 1 MKKMPLFSKSHKNPAEIV--KILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNE 58

: : : I I : I : I | | : : : : I : : I I I I I : | 

Db 309 LPEIDFFSEDRKEKSSSVGYDXKKKNXSNIKRFHNKINRTKEEKKKKWN--KIIINRNNI 366

Qy 59 KEPPTEAVAQLAQELYSSGLLVTLIAD LQLI DFEGKKDVTQI FNN 103 

: I : |:: :: | : |::: | | | ::| 

Db 367 LQHNT--TNKCKTFLFNKHIIFDKIENNNIPLFIYDLLNYIFKSDQTYFYHNNFIDEYKQ 424

Qy 104 ILRRQI--GTRSPTVEYI--SAHPHILFMLLK GYEAPQIALRCGIMLRECI RHEPLA 156
: : II I : : : I t : I I : I I : I I : : | 

Db 425 KICKQIKCSTKKNDI SHI ITSRKENHLFHVQKLENNYKHPNI NKQLRKTKIL 476 

Qy 157 KIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTR-HKV L 197 

I : I I : : I I : I I I : I : I I : 

Db 477 KYVY--NYFKEFINNVINTKFGKIYRKFFPRKHILNKIHKIFKIIRLQIIKKYRIINIRM 534

Qy 198 VADFLEQN- YDTI FEDYE--KLLQSENYVTKR-QSLKLLGELILDRHNFAIMT 246

I : : I I I i I : : I : I : : I : I I : : I M I : I I I I 

Db 535 NRKFIKQKVYDTFFKNYDFLSFSFKTYKIINFMVYITKKCI PIKLLG SKHNFKIFL 590 

Qy 247 KYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKN 297 

I : I 1:1 II : I : I I I I I I I 

Db 591 KNVKK FLLFNYKES FSLNQVMKNI KVKNI FQKKI S KYNI KNRI LLKN 637 



RESULT 14 
US-08-630-822A-70 

; Sequence 70, Application US/08630822A 

; Patent No. 5840695 

; GENERAL INFORMATION: 

APPLICANT: FRANK, GLENN R. 

APPLICANT: HUNTER, SHIRLEY WU 

APPLICANT: WALLENFELS, LYNDA 

TITLE OF INVENTION: NOVEL ECTOPARASITE SALIVA PROTEINS 
TITLE OF INVENTION: AND APPARATUS TO COLLECT SUCH PROTEINS 
; NUMBER OF SEQUENCES: 107 

CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sheridan Ross P.C.

STREET: 1700 Lincoln Street, Suite 3500 

CITY: Denver 

STATE: Colorado 

COUNTRY: U.S.A. 

ZIP: 80203 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.25

; CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/630,822A

FILING DATE: 11-APR-1996

CLASSIFICATION: 435
ATTORNEY/AGENT INFORMATION:

NAME: CONNELL, GARY J. 

REGISTRATION NUMBER: 32,020 

REFERENCE/DOCKET NUMBER: 2618-17-C3
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (303) 863-9700 

TELEFAX: (303) 863-0223 
INFORMATION FOR SEQ ID NO: 70: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 586 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
; MOLECULE TYPE: protein 
FEATURE:
NAME/KEY: Xaa = any amino acid
LOCATION: 379
US-08-630-822A-70 

Query Match 6.2%; Score 105; DB 2; Length 586; 

Best Local Similarity 20.0%; Pred. No. 0.054; 

Matches 77; Conservative 54; Mismatches 136; Indels 118; Gaps 15 

Qy 22 KDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVT 81 

I : : I : : I : | | | : | | | I 

Db 205 KTKI EVI KEEERKI REERQEAREEEEQRKQAELALNAS SAAAEAS S--AQEL 254

Qy 82 LIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALR 141

II -HI I Mil I I : I I II: 

Db 255 LIDTAPVIDAEKTPKV ATSP-VESPLAPPEVLIM GAPK 291

Qy 142 CGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADF 201

1 = 1 • ' s I I : I : I I : I : I I : : I 

Db 292 - TPVATEVDKNADEVEFTK-KDLEWEDALDTLSKDKNNLVIEKEVIKDI 339 

Qy 202 LEQ NYDTI FEDYEKL-- 216

I : I I : : : I 

Db 340 KEEIADYQEDVEELKEAIVAAEKPKDEIKETKGAQRLLKXVNKMITKMDTWQEIESKES 399

Qy 217 LQSENYVTKRQSL-----------KLLGELILDRHNFAI-MTKYISKPENLKLMMNLL-- 262

I : : I : I I I I I : :: I II I : I I I : I : I 

Db 400 EKKAKTLPLEAPRSATETQELDVRKERGEILIDELMDAIKKVKNVPDENRLKLIENILGR 459

Qy 263 --RDKSPNIQFEAFHVFKVF--------VASPHKTQPIVEILLKNQPKLIEFLSSFQKER 312

I I : I : I III : I : : I : : | I : : | | : | : 

Db 460 IDTDKDRHIKVE--DVLKVIDIVEKEDGIMSTKQLDELVQLLKKEE--VIELEEKKEKQE 515

Qy 313 TDDEQFADEKNYLIKQIRDLKKTAP 337 

: : I I : III 

Db 516 SQQKSFVPPSETLHLESSQQKSTVP 540 



RESULT 15 
US-09-005-069-70 

; Sequence 70, Application US/09005069 
; Patent No. 5932470 

GENERAL INFORMATION: 

APPLICANT: FRANK, GLENN R. 

APPLICANT: HUNTER, SHIRLEY WU 

APPLICANT: WALLENFELS, LYNDA 

TITLE OF INVENTION: NOVEL ECTOPARASITE SALIVA PROTEINS 
TITLE OF INVENTION: AND APPARATUS TO COLLECT SUCH PROTEINS 
NUMBER OF SEQUENCES: 107 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sheridan Ross P.C. 

STREET: 1700 Lincoln Street, Suite 3500 

CITY: Denver 

STATE: Colorado 

COUNTRY: U.S.A. 

ZIP: 80203 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/005,069
; FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/630,822 

FILING DATE: 11-APR-1996
ATTORNEY/AGENT INFORMATION: 

NAME: CONNELL, GARY J. 
; REGISTRATION NUMBER: 32,020 

REFERENCE/DOCKET NUMBER: 2618-17-C3 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (303) 863-9700 

TELEFAX: (303) 863-0223 
INFORMATION FOR SEQ ID NO: 70: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 586 amino acids 

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
FEATURE: 

NAME/ KEY : Xaa = any amino acid 
LOCATION: 379 
US-09-005-069-70 

Query Match 6.2%; Score 105; DB 2; Length 586; 

Best Local Similarity 20.0%; Pred. No. 0.054; 

Matches 77; Conservative 54; Mismatches 136; Indels 118; Gaps 15 

Qy 22 KDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVT 81 

I ::::::::[ : : I : : | : | | | : Mil 

Db 205 KTKI EVI KEEERKI REERQEAREEEEQRKQAELALNAS SAAAEAS S--AQEL 254

Qy 82 LIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALR 141

II : I I I I I! I I 11:11 II: 

Db 255 LIDTAPVIDAEKTPKV ATSP-VESPLAPPEVLIM GAPK 291

Qy 142 CGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADF 201

I : I : : : I I : I : I I : I : I I : : I 

Db 292 - TPVATEVDKNADEVEFTK-KDLEWEDALDTLSKDKNNLVIEKEVIKDI 339

Qy 202 LEQ NYDTI FEDYEKL-- 216

I : I I : : : I 

Db 340 KEEIADYQEDVEELKEAIVAAEKPKDEIKETKGAQRLLKXVNKMITKMDTWQEIESKES 399 

Qy 217 LQSENYVTKRQSL-----------KLLGELILDRHNFAI-MTKYISKPENLKLMMNLL-- 262

Is: I : I I I I I : : : I II I : I I I : I : I 

Db 400 EKKAKTLPLEAPRSATETQELDVRKERGEILIDELMDAIKKVKNVPDENRLKLIENILGR 459

Qy 263 --RDKSPNIQFEAFHVFKVF--------VASPHKTQPIVEILLKNQPKLIEFLSSFQKER 312

I I : I : I III : I : : I : : I I : : I I : 1 : 

Db 460 IDTDKDRHIKVE--DVLKVIDIVEKEDGIMSTKQLDELVQLLKKEE--VIELEEKKEKQE 515

Qy 313 TDDEQFADEKNYLIKQIRDLKKTAP 337

Db 516 SQQKSFVPPSETLHLESSQQKSTVP 540

Search completed: January 7, 2004, 16:45:03 
Job time : 29 secs

GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen

OM protein - protein search, using sw model

Run on: January 7, 2004, 16:44:17 ; Search time 21 Seconds
(without alignments)
1543.278 Million cell updates/sec

Title: US-10-088-872-2
Perfect score:
Sequence: 1 MKKMPLFSKSHKNPAEIVKI...FADEKNYLIKQIRDLKKTAP 337

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 283308 seqs, 96168682 residues

Total number of hits satisfying chosen parameters: 283308

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : PIR 76 : *
1: pir1:*
2: pir2:*
3: pir3:*
4: pir4:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
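
The "Gapop 10.0 , Gapext 0.5" parameters in the header describe an affine gap penalty: opening a gap is expensive and lengthening it is cheap. The sketch below illustrates the idea only. It assumes the common convention in which the opening cost covers the first gapped column and each further column adds the extension cost; this report does not say which convention GenCore uses, and the function name is hypothetical.

```python
GAP_OPEN = 10.0    # "Gapop" value from the report header
GAP_EXTEND = 0.5   # "Gapext" value from the report header


def affine_gap_penalty(length, gap_open=GAP_OPEN, gap_extend=GAP_EXTEND):
    """Penalty for a single gap spanning `length` alignment columns.

    Assumed convention (not confirmed by the report): the first column
    costs gap_open, every additional column costs gap_extend.
    """
    if length <= 0:
        return 0.0
    return gap_open + (length - 1) * gap_extend


# One 13-column gap versus thirteen separate 1-column gaps.
print(affine_gap_penalty(13))      # 16.0
print(13 * affine_gap_penalty(1))  # 130.0
```

Under this scheme a single long insertion is penalised far less than many scattered ones, which is why the alignments in this report tend to show a few long gap runs rather than many isolated gap columns.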

SUMMARIES

               %
             Query
No.   Score  Match  Length  DB  ID      Description
  1    1376   80.8     341   2  I57997  hypothetical calci
  2  1063.5   62.4     377   2  T16651  hypothetical prote
  3  1006.5   59.1     338   2  T27129  hypothetical prote
  4   834.5   49.0     329   2  T50117  mo25 homolog [impo
  5     685   40.2     305   2  G71441  hypothetical prote
  6     632   37.1     348   2  B84448  hypothetical prote
  7     485   28.5     399   2  S34681  hypothetical prote
  8   143.5    8.4     339   2  T33477  hypothetical prote
  9   134.5    7.9     677   2  H64574  DNA topoisomerase
 10     128    7.5     430   2  H64709  hypothetical prote
 11   125.5    7.4     298   2  B71685  hypothetical prote
 12   125.5    7.4    1642   2  T08880  NMDA receptor-bind
 13   123.5    7.2    1285   2  B72420  hypothetical prote
 14     120    7.0    1175      F64489  hypothetical prote
 15   113.5    7.0     959      T00246  DNA polymerase V -
 16     115    6.7     474      S71322  glutathione syntha
 17   113.5    6.7     833      T43446  hypothetical prote
 18   112.5    6.6    1411      S55123  hypothetical prote
 19   111.5    6.5     725      JC5016  hyaluronan recepto
 20   111.5    6.5    2401      T28676  rhoptry protein -
 21     111    6.5    2166      G70163  hypothetical prote
 22     111    6.5    2819      A90551  conserved hypothet
 23   109.5    6.4     457      C82911  hypothetical prote
 24   109.5    6.4     978      A70387  conserved hypothet
 25   109.5    6.4    1830      E82909  conserved hypothet
 26     109    6.4     695      T07283  hypothetical prote
 27     109    6.4    1401      SH527   alpha-latrotoxin p
 28   108.5    6.4     442      T18507  hypothetical prote
 29   108.5    6.4     952      T50451  hypothetical coile
 30   108.5    6.4    1163      D64315  type 1 restriction
 31     108    6.3     568      S73254  replication helica
 32   107.5    6.3     483      I40055  positive trans-act
 33   107.5    6.3     855      E90106  importin beta-1 SU
 34   107.5    6.3    1042      G64514  type 1 restriction
 35   107.5    6.3    1726      SAZQGM  major merozoite su
 36   107.5    6.3    1726      A45948  major merozoite su
 37     107    6.3     570      S68686  phosphoprotein pho
 38     107    6.3    1173      T43527  sp8 protein - fiss
 39     107    6.3    1727      T50073  myosin-like coiled
 40     106    6.2     474      S56748  glutathione syntha
 41     106    6.2    1295      T24587  hypothetical prote
 42   105.5    6.2     781      T00456  protein kinase hom
 43   105.5    6.2     847      A56039  GTPase-activating
 44   105.5    6.2    1091      T34107  hypothetical prote
 45   105.5    6.2    1619      T18499  hypothetical prote

DB (results 14-45; 24 of 32 values survive, row positions lost): 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2



ALIGNMENTS 



RESULT 1 

Hypothetical calcium-binding protein - mouse 

V^l^-M* 1 ^^-™^ oi-Aua-1996 #te*t__change 1 9 -May-2000 

C;Accession: I57997

R;Miyamoto, H.; Matsushiro, A.; Nozaki, M.

A;Title: Molecular cloning of a novel mRNA sequence expressed in cleavage stage
A;Reference number: I57997; MUID:93119656; PMID:8418809

A;Status: preliminary; translated from GB/EMBL/DDBJ

A;Molecule type: mRNA

A;Residues: 1-341 <RES>

A;Cross-references: GB:S51853; NID:g262933; PIDN:AAB24801.1; PID:g262934

C;Superfamily: Saccharomyces hypothetical protein YKL189w
C;Keywords: calcium binding



Query Match 



80.8%; Score 1376; DB 2; Length 341; 



Qy 

Db 

Qy 

Db 



Best Local Similarity 80 . 7% ; Indels 4; Gaps 

Matches 272; Conservative 32; Mismatcnes 

4 MPL - FSKSHKNPAE I VKILKDNLAI LEKQ DKKTDKASEEVSKSLQAMKE ILCGTNEK 59 

II I I I I I • II • I I I I I : : : I : I I = : I ! I : I I M U I Mill 

1 MPFPFGKSHKSPADIVKNLKES^VLEKQDI 60 

SO E PPTEAVAQLAQELYS SGLL VTL I ADLQL IDFEGKKD VTQ I "j" ~[ TIT^ ThmT t 

6i epqteavaqlaqelynsglijGTL^ 120 



- - l7 8 9 0 

121 CTQQNILFMLLKGYESPEIALNCG^ 180 



Db 
Qy 
Db 
QY 
Db 

Qy. 

Db 

V« ~ C 



,.8! iUiUUkiiUULLLF^HTORFFSEYEKI.LHSENYVTKEQSLKLlGELLLDE 240 
210 ^EalMTKYISKPENLKLMMNLL^ " 9 

-00 KL I EFL S S FQKERTDDEQF ADE KN YL I KQI RDLKKTA 336 

f I I I I 1 I II =11 = 1111 HI I I : I I I I : I I : I 
•--01 '''fjIEFLSKFQNDRTEDEQFNDEKTYliVKQIRNLKRAA 337 



hypothetical protein R02E12.2 - Caenorhabditis elegans

C;Species: Caenorhabditis elegans

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 18-Feb-2000

C;Accession: T16651

R;Leimbach, D.

submitted to the EMBL Data Library, April 1996
A;Description: The sequence of C. elegans cosmid R02E12.
A;Reference number: Z18554

%SZXSZ*££*> «a„sl,ted £ r» GB/EMBL/DDBJ 
A; Molecule type: DNA 

i^t2 1 !re£e^lBfISL:U53337.. NID : gl25583.3 ; PID : 9 1255838 ; P1DN : AAA96187 . 1 ; 

GSPDB ■ GN00 02 8 ; CESP : R02E12 . 2 da _ 0 
^Experimental source: strain Bristol N2 ; clone R02E12 

C; (Genetics : ... 

A; Gene: CESP : R02E12 . 2 

A; Map position: X onI -/o 

C;Superfamily: Saccharomyces hypothetical protein YKL189w

Query Match 62.4%; Score 1063.5; DB 2; Length 377;
Best' Local Similarity 60^5% ; ^^-64 ' 68 ; Indels X7; Gaps 2 
Matches 211; Conservative- 53, Mismatcnes 

4 MP - LFSKSHKNPAEI VKILKDNLAI IiEK QDKKTDKASEEVSKSLQAM 49 

C/ . I | | || | | | : | | : : I ! I - I I I = I ' " ' ' ' ' 



, MPLLFGKSKKSPADWKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60 
6 , Lpx.UipSsUQSvUrUUpKPEPCCKKBVOOI^ U . 

27 Q s™ MB H»P H M T K T i f p— n m • m 286 

m pkpiedii^^wrekIveflsefhndrtodeqfhdekaylikqiqemkss ,« 



Db 

QY 
Db 

Qy 

Db 

Qy 

Db 

Qy i i i i i i i l I • I I I M I I I I 1 II I : I I : I I 

Db 

Qy 

Db 

RESULT 



^thetlcal protain - Ca.narhabditis elegans 

'.(^LSrSSf^^^ «»"-< h «^ : 



C;Accession: T27129
R;5Cershaw f J- ; Lennard, N 



t o urss ^ i*«y. .«*—■ ^ 1597 : : 

ts - R.-^f ?rence number : Z20315 

E'^"!^™"^'''^'- translated £r°™ GB/EMBI./DDBJ ■ ' 

A; Molecule type: DNA 

A ; Residues: 1-338 <WIL> ■ PIDN : CAB164 86 . 1 ; GSPDB :GN00020 ; CESP = Y53C12A . 

A, Cross-references: EMBL . Z9927 / , 

A; Experimental source: clone Y.3C12A 



C; Genetics : 

A;Gene-. CESP : Y53C12A. 4 



A;Map. position: 2 „c/i. 282/3 

A;Introns: 29/3; 103/3; 136/2; 2^/1- 282/3 
C;Superfamily: Saccharomyces hypothetical pro 

Query Match 59.1%; Score 1006.5; DB 2; Length 338;
Best Local Similarity 57.2%; Pred. No. 3.9e-60;

s^uTES-^ •» Mis » atches ,s; Indels ' 

5 pLFSKSHKMPAEIVKILKDNLAILEK - -QDKKTDKftSEEVSKSLQAMKEILCGTMEK 5 , 
. lii G LiTlULipLiLviDRHG T »TSE R KVEK AI EETAKM IAI ,AKTFI- f G SD «. 63 

«. EPPTEaV^O-SSG^ » 
64 UJ Q iTiliiEvU M ™ip„iiF J ,iHKFElECKULsVF HB E,HRO IG TRSPTV E .E , 

J2 „ 3M .ph II ,f M ,ekg,e,p QI ae R gg I herecirhep,ak II ,f S » Q fROFfk™estf D i r 



Qy 

Db 

Qy 

Db 

Qy 



Db 1M ^lel^iUUoloiiULiULviiLti.lvtiLv^iUosovUi 3,3 

Db 304 rivEFLTAFHNDRTNDEQFNDEKAYLIKQIQELR 337 

RESULT 4

mo25 homolog [imported] - fission yeast (Schizosaccharomyces pombe)
^^-SSSS 6 ^^^ 03—00 ,te«_cha„ 9 . ,-i«i 3„. 
Concession: T50317 Rajandre.n,. H.A.; Barrel!, 3.G. 

Si to rSSL oaca Library. February 2000 
^/Reference number: Z25039 

A; Accession : "Oil? f r0 m GB/EMBL/DDBJ 

A; Status: prelinunaiy; transiauea 

:. r M ?.l ecule type : DNA 

.:::^^ef,re„cL: S ™i... M .l 5 7,34,. ,n»,e«7»774.1, GSPDB^OOOSS ; . - ; •. 

^"iilource: strain 30 2 b,-,; cos^id =303, ■ 

C; Genetics: , ' 1 .' 

A; Gene: SPDB : SFAC1B34 . 06c ; 
A; Map position : 1 

cSSa^y-'saccnaUces bypotbeticaf protein VKL380. 

CoeryMateb ^ V,^ ^ , 

2^^^.^ 03,. tndeis 3; Gaps 3, 

333 1™™^"^ 
^ 303 ^jLLol^ ^ 



I 



Db 

Qy 
Db 



RESULT 5 
G71441 



242 Atryissaenlklmmillrdkskniqfeafhvfklfvanpekseevieilrrnksklisy 

30-, LSSFQKERTDDEQFADEKNYLIKQIRDL 332 

"||:| :| =1111 11= -MM I 
3 02 LSAFHTDRKNDEQFNDERAFVIKQIERL 32 9 



301 



f' 7 ' 4 4 -L • 

hypothetical protein - Arabidopsis 
C;Species: Arabidopsis thaliana (mouse-ear cress)

C^ll^M #se q uence_revision 03-Aug-1998 #te*t_change 18-Aug-2000 

C ';Accession: G71441 _ ^ _ Goodman, H.; Dean, C . ; Bergkamp, 

R.-Bevatf, M. ; Bancroft, I.; Bent, t... . Ridley, P.; Hudson, 

r. ; Dirkoe, W.; Van Staveren, M Strek ema -I h . . 

S.A.; Patel. K. ; ^phy, O. . Pjf « anellr j. .. villarroel. R-; De 

r.j Weitzenegger, T.; Pohl , T.M., lerryn, Kreis, M . ; Lao, 
clerck , r. ; van Montagu * • Lecharny, A ; Auborg, S.. y. ;> ^ 
N.v Kavanagh, T . ; Hempel , S., Kotter, f - , 
M . ; Funk , B . 

Nature 3 91, 4 35-4 88, 199 8 . Mont fort, A.; Pons, A. ; 

A. Author,: Mueller-Auer, S. ; ^ey J-es ^, d . ; Hatzopoulos , P - ; 

Puigdomenech, P DouKa A_, Voukelat ou ^ A. ; Moore s , T - ; Jones, 

Piravandi, E. ; Obermarer, B. t r s ^ w.; Cooke, -R ; 

^:; r ; ri^seny^rvoet'M" volckaert, Mewes, H.W. ; Klosterman, „. s 

Mtle: r Analys™ Z 9V6f contiguous sequence from chromosome 4 of . 
Terence ^"^71400; MUID:98121113; PMIP:9461215 

A ;3tatu;: 0 pre!Iminary; nucleic acid sequence not shown; translation not- shown 
A; Molecule type: DNA 

A;Residues: -^05 <BEV » >jid :g2245073 ; PID : e327 05 1 ; PID:g2245086 

A : Cross - references : GB:/.y sis, -''■>■" a 



C; Genetics : 



A; Map position: 4COP9-4G3845 Dro ^ e in YKL189W 

C-Superfamily: Saccharomyces hypothetical pro-.ein ^ 

Query Match 40.2%; Score 685; DB 2; Length 305;



Qy 

Db 

Qy 
Db 



41 EVSKSLQAMKEILCGTNEKEPPTEAVAQ^QELYSSGIiLVTLIADtQI'IDFEGKroVTQI 100 

0 bLSKSIPELKLiIyGNsL^PVASACAQLTQ^ " 
101 FNHILRRQIGTRSPTVEYISAHPHILPHIiLKGYEAPQIALRCGIMLRECIRHEPLAKI 'ili 160 
6S VAMLQRQQ^SEI'IAADYLESNIDLMDFLVDGFENTDMALHYGTMFRECIRHQIVAKYVL 12, 

isi fs^frofekyve^iaecaf™™— 



1.1 F f QFRO„, ; , :r — •■ ,„ ,,,,,,,, 11-11 .1 I 



Db 

Qy 

Db 

Qy 

Db 



128 DSEHVKKFFYYIQLPNFDIAADAAATFKELLTRHKSTVAEFLIKNEDWFFADYNSKLLES 187 
220 ENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMM^ 

188 TiYllRiQAliiiiDILiDRsLUiiYVSSMDNLRILMNLLRESSKTIQIEAFHVFKL 247 
280 FVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEI^YLIKQIRDLK 333 
248 F VANQNKP SD IAN I L VANRNKLLRLL AD I KPDK - EDERFD ADKAQWRE I ANLK 300 



RESULT 6 

hypothetical protein At2g03410 [imported] - Arabidopsis thaliana

C c!^-^JU^^S 

Fujii, C.T., "asoa, T.M., f ™ an ^ n L B ^ St "o;, Moffat. K.S.; Cronfn, 

C.H.; Ketclvoro, K.A.; Lee, J. J . ; Kpnning, C '" a '. ±n JE . adaalB , 

r.M. ; Venter, J.C. 

AtTSet^S^eicr'aAd^nalysis of chrome 2 of the plant Arabidopsis 



A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: B84448

A;Status: preliminary

A;Molecule type: DNA

A;Cross-references: GB:AE002093; NID:g4335758; PIDN:AAD17435.1; GSPDB:GN00139

C;Genetics:

A;Gene: At2g03410

C;Superfamily: Saccharomyces hypothetical protein YKL189w

Query Match 37.1%; Score 632; DB 2; Length 348;

Best Local Similarity 38.7%; Pred. No. 4.4e-35;
Matches 133; Conservative 80; Mismatches 117; Indels 14; Gaps

0V 6 LFSKS.HKNPAEIVKILKDNLAILEKQDKKTDKASE EVSKSLQAMKEILCGTNE 58 

Db 4 LFKNKSRLPGEIWQTRDLIAL^ 63 

Qy S9 KEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGK^ 118 

64 A ip V pUcLLlTQiFFRAD T iRpiiKsipKLDLEARKDATQIVANLQKQQVEFRLVASEY 12" 



3 



119 ISAHPHILFMLLKGYEAP-QIALRCGIMLRECIRHEPLAKIILFSNQFRD 177 
124 LESNLDVIDsivEGIDHDHELiiHYTGMLKECVRHQVVAKYILESKNLEKFFDYVQLPYF 183 

; 17S rf r ~— 236 

Db 184 DVATDASKIFRELLTRH 243 



Qy 237 LDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLK 296

Y • | | | :| | |:| :||::|||||h : III lllhlhllh H : II lh 

Db 244 MDRSNSGVMVKYVSSLDNLRIMMNLLREPTKNIQLEAFHIFKLFVANENKPEDIVAILVA 303

Qy 297 NQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK----KTA 336

|: |:: = : |= :| | =| =: =1 I III 
Db 304 NRTKILRLFADLKPEK-EDVGFETDKALVMNEIATLSLLDIKTA 346

RESULT 7 
S34681 

hypothetical protein YKL189w - yeast (Saccharomyces cerevisiae) 
C;Species: Saccharomyces cerevisiae
C;Date: 30-Sep-1993 #sequence_revision 30-Sep-1993 #text_change 19-Apr-2002
C;Accession: S34681; S33963; S38021; S38026
R;Wiemann, S.; Voss, H.; Schwager, C.; Rupp, T.; Stegemann, J.; Zimmermann, J.;
Grothues, D.; Sensen, C.; Erfle, H.; Hewitt, N.; Banrevi, A.; Ansorge, W.
submitted to the EMBL Data Library, July 1993
A;Description: Sequencing and analysis of 51.5 kilobases on the left arm of
chromosome XI from Saccharomyces cerevisiae reveals 23 open reading frames
including the FAS1 gene.
A;Reference number: S34679
A;Accession: S34681
A;Molecule type: DNA
A;Residues: 1-399 <WIE>
A;Cross-references: EMBL:X74151; NID:g450365; PIDN:CAA52249.1; PID:g395236
A;Experimental source: strain S288C
R;Cheret, G.; Mattheakis, L.C.; Sor, F.
Yeast 9, 661-667, 1993
A;Title: DNA sequence analysis of the YCN2 region of chromosome XI in
Saccharomyces cerevisiae.
A;Reference number: S33960; MUID:93348778; PMID:8394042
A;Accession: S33963
A;Molecule type: DNA
A;Residues: 1-399 <CHE>
A;Cross-references: GB:X69765; NID:g296985; PIDN:CAA49422.1; PID:g296989
R;Wiemann, S.; Voss, H.; Schwager, C.; Rupp, T.; Grothues, D.; Sensen, C.;
Stegemann, J.; Zimmermann, J.; Erfle, H.; Hewitt, N.; Ansorge, W.
submitted to the Protein Sequence Database, March 1994
A;Reference number: S37825
A;Accession: S38021
A;Molecule type: DNA
A;Residues: 1-399 <WI2>
A;Cross-references: EMBL:Z28189; NID:g486334; PIDN:CAA82032.1; PID:g486335;
MIPS:YKL189W
A;Experimental source: strain S288C
R;Maia e Silva, A.; Bossier, P.; Vilela, C.; Fernandes, L.; Soares, H.;
Guerreiro, P.; Rodrigues-Pousada, C.
submitted to the Protein Sequence Database, March 1994
A;Reference number: S38024
A;Accession: S38026
A;Molecule type: DNA
A;Residues: 1-399 <MAI>
A;Cross-references: EMBL:Z28189; NID:g486334; PIDN:CAA82032.1; PID:g486335;
MIPS:YKL189W
A;Experimental source: strain S288C
C;Genetics:
A;Gene: SGD:HYM1
A;Cross-references: SGD:S0001672
A;Map position: 11L
C;Superfamily: Saccharomyces hypothetical protein YKL189w

Query Match 28.5%; Score 485; DB 2; Length 399;

Best Local Similarity 33.0%; Pred. No. 3.6e-25;

Matches 113; Conservative 75; Mismatches 138; Indels 16; Gaps



Qy 7 FSKSHKNPAEIVKILKDNLAILEK----QDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62
: |: | |:: : = : : | Ml 1 II =1 1 1=1 = 1
Db 16 WKKNPKTFSDYARLIIEQLNKFSSPSLTQDNKR-KVQEECTKYLIGTKHFIVGDTDPHPT 74

Qy 63 TEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122
||: :| :: : : |: ■ ::|| = = : Ih 1 = ihh =
Db 75 PEAIDELYTAMHRADVFYELLLHFVDLEFEARRECMLIFSICLGYSKDNKFVTVDYLVSQ 134

Qy 123 PHILFMLLKGYE---APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELS 175
V : =:|= | 1 1 1 h MhM 1 MM 1 II- : 1
Db 135 PKTISLMLRTAEVALQQKGCQDIFLTVGNMIIECIKYEQLCRIILKDPQLWKFFEFAKLG 194

Qy 176 TFDIASDAFATFKDLLTRHKVLVA-DFL--EQNYDTIFEDYEKLLQSENYVTKRQSLKLL 232
|:|:::: II II" = 1 II = Ih MINIM III
Db 195 NFKISTESLQILSAAFTAHPKLVSKEFFSNEINIIRFIKCINKLMAHGSYVTKRQSTKLL 254

Qy 233 GELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVE 292
Ih 1 1 hi Ih - III [1 hi h Ml hi II hi III Ih! h = h =
Db 255 ASLIVIRSNNALMNIYINSPENLKLIMTLMTDKSKNLQLEAFNVFKVMVANPRKSKPVFD 314

Qy 293 ILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKK 334
||:Jh Ih : : 1 : : 1 — 1 i :
Db 315 ILVKNRDKLLTYFKTFGLD-SQDSTFLDEREFIVQEIDSLPR 355





RESULT 8
T33477

hypothetical protein T27C10.3 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 29-Oct-1999
C;Accession: T33477
R;Zhu, H.J.; Graves, T.; Hawkins, M.
submitted to the EMBL Data Library, October 1998
A;Description: The sequence of C. elegans cosmid T27C10.
A;Reference number: Z21354
A;Accession: T33477
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-339 <ZHU>
A;Cross-references: EMBL:AF098504; PIDN:AAC67411.1; GSPDB:GN00019; CESP:T27C10.
A;Experimental source: strain Bristol N2; clone T27C10
C;Genetics:
A;Gene: CESP:T27C10.3
A;Map position: 1
A;Introns: 72/3; 120/3; 233/3; 295/1

Query Match 8.4%; Score 143.5; DB 2; Length 339;

Best Local Similarity 19.3%; Pred. No. 0.02;

Matches 38; Conservative 50; Mismatches 76; Indels 33; Gaps 4;

Qy 159 ILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ 218 

:: :|:||| II = II— : :| = : I : lh 

D b 100 LMNTNKFRD FDVIQGTFDTLQI I FFTNHES ANNF I KNNLPRFMQTLHKLIA 150 

Qy 219 SENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFK 278

|= : :| I Ml = h = : I h • == := I 

Db 151 CSNFFIQAKSFKFLNELFTAQTNYETRSLWMAEPAFIKLWLAIQSNKHAVRSRAVSILE 210 

Qy 279 VFVASPHKTQPIVEILLKNQPKLIEFL SSFQKERTDDEQFAD 320 

:|: :| : : I : :h II I I - I I I hi 

Db 211 I F IRNPRNS PE VHEF IGRNRNVL I AFFFNS AP IHYYQGS PNEKE : DAQ YARMAYKLLN 267 

Qy 321 E KNY L I KQ I RDLKK 334. 

Db 268 WDMQRPFTQEQLQDFEE 2 84 



RESULT 9 
H64574 

DNA topoisomerase I - Helicobacter pylori (strain 26695) . 
C;Species: Helicobacter pylori
C;Date: 09-Aug-1997 #sequence_revision 09-Aug-1997 #text_change 08-Oct-1999
C;Accession: H64574
R;Tomb, J.F.; White, O.; Kerlavage, A.R.; Clayton, R.A.; Sutton, G.G.;
Fleischmann, R.D.; Ketchum, K.A.; Klenk, H.P.; Gill, S.; Dougherty, B.A.;
Nelson, K.; Quackenbush, J.; Zhou, L.; Kirkness, E.F.; Peterson, S.; Loftus, B.;
Richardson, D.; Dodson, R.; Khalak, H.G.; Glodek, A.; McKenney, K.; Fitzegerald,
L.M.; Lee, N.; Adams, M.D.; Hickey, E.K.; Berg, D.E.; Gocayne, J.D.; Utterback,
T.R.; Peterson, J.D.; Kelley, J.M.; Cotton, M.D.; Weidman, J.M.; Fujii, C.;
Bowman, C.; Watthey, L.
Nature 388, 539-547, 1997
A;Authors: Wallin, E.; Hayes, W.S.; Borodovsky, M.; Karpk, P.D.; Smith, H.O.;
Fraser, C.M.; Venter, J.C.
A;Title: The complete genome sequence of the gastric pathogen Helicobacter
pylori.
A;Reference number: A64520; MUID:97394467; PMID:9252185
A;Accession: H64574
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-677 <TOM>
A;Cross-references: GB:AE000559; GB:AE000511; NID:g2313536; PIDN:AAD07502.1;
PID:g2313542; TIGR:HP0440
C;Superfamily: DNA topoisomerase I

Query Match 7.9%; Score 134.5; DB 2; Length 677;

Best Local Similarity 21.6%; Pred. No. 0.19;

Matches 88; Conservative 58; Mismatches 134; Indels 127; Gaps 16;

Qy 7 FSKSHKNPA-EIVKILKDNL AILEKQDKK TDKASEEVSKSLQAMKE 51 

| || i i : :| III I I = II ! I - 1111 = 
Db 222 FKFKDKNEASQFLKDLKDGLGSMSVLVSLKESLSNKEPKKPFTTSKLLSQASKSLKI 278 

Qy 52 ILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGT 111 

||: =11111 = 1= =lh | : : | h II 



Db 779 PTKEIAQLAQKLFEAGLITYHRTDSEFLSPEYLKEHEVFFEPIY 322 

n R c:p T v EYIS AHPHILFMLLKGYEAPQIALRCGIMLRECIRHE 153 

QY " "~\:\ || : III I I I =1= =1 = 1 

Db 323 --psVYQYREYKAGKNSQAEAHEAIRITHPHALKDLEKVCSDAKISEELALKLYQLIYTN 380 

nv i 54 p L _.- AK IILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIF 210 

7 " : :: |: ||: M I I = = I I I 1=11=1 

Db 361 TICSQSRNALY-NQYDCIFK IKSESFKLSFKLLKEKGFLEIEELIQGKEEIN 431 

Ov 211 EDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQ 270 

: I : : | | : I 11= = = =1 I I 5 I I 8 
Dfc 432 RE-EQESEIENFSLKENDSVPLKEVFIKK IEKPSPKPYKESAFIPLLESEG 481 



QY 



27- FE AFHVFKVFVAS PHKTQP I VE ILLKNQ PKLIEFLSSFQKERTDD- 315 

: | :::||| : :. =| =1 l=t = = I 

Db 482 iGRPSTYASFLDLLLKRKYISIDTKTNAITPTSQGLEVISFFKKDKEVDF 531 



QY 



-, 16 5QF ADEKNYLIKOIRDLKKTA 336 

I = = = = = I II II 
Db I ALTS KDKS KLGNTTKQFEECLDL IMRGEAS YEKFMLE VI S KLKSTA 578 



RESULT 10 
H64709

hypothetical protein HP1520 - Helicobacter pylori (strain 26695)
C;Species: Helicobacter pylori
C;Date: 09-Aug-1997 #sequence_revision 09-Aug-1997 #text_change 08-Oct-1999
C;Accession: H64709
R;Tomb, J.F.; White, O.; Kerlavage, A.R.; Clayton, R.A.; Sutton, G.G.;
Fleischmann, R.D.; Ketchum, K.A.; Klenk, H.P.; Gill, S.; Dougherty, B.A.;
Nelson, K.; Quackenbush, J.; Zhou, L.; Kirkness, E.F.; Peterson, S.; Loftus, B.;
Richardson, D.; Dodson, R.; Khalak, H.G.; Glodek, A.; McKenney, K.; Fitzegerald,
L.M.; Lee, N.; Adams, M.D.; Hickey, E.K.; Berg, D.E.; Gocayne, J.D.; Utterback,
T.R.; Peterson, J.D.; Kelley, J.M.; Cotton, M.D.; Weidman, J.M.; Fujii, C.;
Bowman, C.; Watthey, L.
Nature 388, 539-547, 1997
A;Authors: Wallin, E.; Hayes, W.S.; Borodovsky, M.; Karpk, P.D.; Smith, H.O.;
Fraser, C.M.; Venter, J.C.
A;Title: The complete genome sequence of the gastric pathogen Helicobacter
pylori.
A;Reference number: A64520; MUID:97394467; PMID:9252185
A;Accession: H64709
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-430 <TOM>
A;Cross-references: GB:AE000650; GB:AE000511; NID:g2314700; PIDN:AAD08565.1;
PID:g2314705; TIGR:HP1520
C;Superfamily: Helicobacter pylori hypothetical protein HP1520

Query Match 7.5%; Score 128; DB 2; Length 430;

Best Local Similarity 20.9%; Pred. No. 0.29;

Matches 82; Conservative 73; Mismatches 135; Indels 102; Gaps 20;

Q 7 FSKSHKHPAEI VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62 

| : | : || Mlhhh h : :(h I - h 
Db 60 FYPNRKSKIEIEFNGEKILKENVAVFHSYDE - - EFSSEDSVTTFMAKSDL KQQY 111 



Ov 6'. T3AVAQLAQELYSSGLLVTL--IA DLQLIDFEGKKDVTQIFNNILR 106 

" : :| :| l| :| || ::: I I I M =1 I 

Db 112. DNILLELEKE- - KKALLKSLRDIASGFDYEEEIKTIKNEKNKSFYEILDNHLTEIESSEK 169 

q 107 rqigtRSPTV-EYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKII 159 

Db 17 0 HYS FKYRD I FDGSKKVKDFVNKHHDLI EQYFNKYQ ELLSQSK 211 

nv 160 LF SNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQ 204 

Db 212 IFKHMNSGDFGTNHADDLKKALENNRFFKANHSLKI AGEE ITNYQKL - SDI FENEKNRIL 270 

0v 2o r . NYDTIFEDYEKTjLQSENYVTKRQSLKLLGELI LDRHNF- -AIMTKYISKP 252 

i : : | ::|: | : ■ : || ■= I II 'I - .1= ' 

Db 271 NNEELKESFDKI---EKVINANKELKAFKDAISKDNTLLTEFLDYDSFRKKVLFSYLKQV 327 

0v 253 -ENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKE 311 

:|:| ::|| |:| | h . = I ^ -H U h I I • 

Db 328 I ONVKS LVNL YREKKPE I E E 1 1 KQAS KDQKEWE S VI E I F - - NQRFLVPFKVELQNQ 381 



312 R TDDEQ FADEKNYLIKQIRDLKK 334 

II hl= I 11 = 1 

b 3 02 KDILLNKDAAQFRFIFSDDNQDMNVQKEDLQK 413 



RESULT 11
B71685

hypothetical protein RP295 - Rickettsia prowazekii
C;Species: Rickettsia prowazekii
C;Date: 21-Nov-1998 #sequence_revision 21-Nov-1998 #text_change 03-Nov-2000
C;Accession: B71685
R;Andersson, S.G.E.; Zomorodipour, A.; Andersson, J.O.; Sicheritz-Ponten, T.;
Alsmark, U.C.M.; Podowski, R.M.; Naeslund, A.K.; Eriksson, A.S.; Winkler, H.H.;
Kurland, C.G.
Nature 396, 133-140, 1998
A;Title: The genome sequence of Rickettsia prowazekii and the origin of
mitochondria.
A;Reference number: A71630; MUID:99039499; PMID:9823893
A;Accession: B71685
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-298 <AND>
A;Cross-references: GB:AJ235271; GB:AJ235269; NID:g3868717; PIDN:CAA14756.1;
PID:g3360856; GSPDB:GN00081
A;Experimental source: strain Madrid E
C;Genetics:
A;Gene: RP295

Query Match 7.4%; Score 125.5; DB 2; Length 298;

Best Local Similarity 20.1%; Pred. No. 0.27;

Matches 62; Conservative 57; Mismatches 114; Indels 75; Gaps 13;

0v 73 LYSSGLLVTLIADLQLIDFEGKKDVTQ IFNNILRRQIGTRS 113 

|= |:| | : = = |: = : i I : = I = I 

Db 6 LFIQLLIVTSLVKAEI IEVDSLNKITQDFKVNYNKNYLPQDLLVVTVLDKFLFKSFGV- - 63 



11A pTVEYISAHPHILFMLLiKGY- -EAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166 

| | | | I s : | : : h I : : : | | : | = : 

Db 64 PIGEYIDQHRYLALAPLFSHINKNPKIIY ITQLILTNNSYKKELQE 109 

Ov 167 -DFFKYV-ELSTFDI ASDAFATFKDLLTRHKVLVADFIjEQNYDTIFEDYEKLLQSE 220 

II :| |:j | :: I : : : == \\ \ '■ \ ■ : I I 

Db 110 SDFPNFWEMSNSQIPIIAVNNGFTGNFNNIPKFEIWFADYLKKNF YIDFSKSFPNN 166 

Ov 221 NYVTKRQSLKLLGELILDRHNFAIMTKYISKPENL KLMMNLLRDKSPNIQFEAFHVF 277 

Y II: : | : : : I I 1= I — I I I I 

Db 167 jjyI IFNNLDSFDNTYPVFYKGILTSNNIPASKVILNFL IQINFIPKC 213 

0v 278 KVFVASPHKTQP I VE ILLKNQPKLI EFLSS F - - QKERTDDEQFADEKNY LIKQI 329 

: ::| : :| | | i h I HI' ' I I H H 

Bb 214 FILISSSRELLRSMEFQLNNYSSNILFIGYHYNNKSISDDKDYKDIAYYTKMINDLIPQI 273 

Qy 330 RDLKKTAP 33 7 

11= I 

Db 274 NKLKRNNP 281 

RESULT 12 
T08880 

NMDA receptor-binding protein yotiao - human
C;Species: Homo sapiens (man)
C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 21-Jul-2000
C;Accession: T08880
R;Lin, J.W.; Wyszynski, M.; Madhavan, R.; Sealock, R.; Kim, J.U.; Sheng, M.
J. Neurosci. 18, 2017-2027, 1998
A;Title: Yotiao, a novel protein of neuromuscular junction and brain that
interacts with specific splice variants of NMDA receptor subunit NR1.
A;Reference number: Z16511; MUID:98151389; PMID:9482789
A;Accession: T08880
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1642 <LIN>
A;Cross-references: EMBL:AF026245; NID:g2623067; PIDN:AAB86384.1; PID:g2623068
C;Genetics:
A;Map position: 7q21-22
C;Keywords: brain; cerebral cortex; coiled coil; neuromuscular junction;
skeletal muscle

Query Match 7.4%; Score 125.5; DB 2; Length 1642;

Best Local Similarity 20.2%; Pred. No. 2.4;

Matches 77; Conservative 73; Mismatches 117; Indels 115; Gaps 15;
0v 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

:: UNI | || = I 1 = 1= == 1= 11= II I 
Db 6 64 IEKLKDNLGIHYKQ- -QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 710 

n,, 7 8 LLVTL1ADLQ- -LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

:: : i|| |:: : :: | | | | : : : | || h = = 

Db 711 - - ISKLKDLQQSLVNSKSEEMTLQI - -NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 766 

Qv 176 LFMLLKGYEAPQIALRCGIMLREC1RHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

| :| | I : : I : I II 

Db 767 LEKOMKEKE NDLQEKFAQLEAEN-SILKDEKK 797 



Qy 


136 


Db 


798 


Qy 


233 


Db 


358 


Qy 


265 


Db 


918 


Qy 


321 


Db 


975 



± r JVUJ-iJu x r^.n aw j_i vr^j. - - ~ 

! : |:| | ::: | :: :: |:::| -I II hi -I 1 = 

TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEIIilEENEDLKQQCIQLNEEIEK 857 

233 DRHNFAIMTK YISKPENLKLMMNLLRD 264 

1:1 I I I = I : : I ! 



: I I III =1 

3 S VFDEDKTFVA ETLI 

-EKNYLIKQIRDLK 33 3 
i : = | = : II 



RESULT 13 
B72420

hypothetical protein TM0088 - Thermotoga maritima (strain MSB8)
C;Species: Thermotoga maritima
C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 21-Jul-2000
C;Accession: B72420
R;Nelson, K.E.; Clayton, R.A.; Gill, S.R.; Gwinn, M.L.; Dodson, R.J.; Haft,
D.H.; Hickey, E.K.; Peterson, J.D.; Nelson, W.C.; Ketchum, K.A.; McDonald, L.;
Utterback, T.R.; Malek, J.A.; Linher, K.D.; Garrett, M.M.; Stewart, A.M.;
Cotton, M.D.; Pratt, M.S.; Phillips, C.A.; Richardson, D.; Heidelberg, J.;
Sutton, G.G.; Fleischmann, R.D.; White, O.; Salzberg, S.L.; Smith, H.O.; Venter,
J.C.; Fraser, C.M.
Nature 399, 323-329, 1999
A;Title: Evidence for lateral gene transfer between Archaea and Bacteria from
genome sequence of Thermotoga maritima.
A;Reference number: A72200; MUID:99287316; PMID:10360571
A;Accession: B72420
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-1285 <ARN>
A;Cross-references: GB:AE001695; GB:AE000512; NID:g4980569; PIDN : AAD3.D 1 fa 2 . 1 ;
PID:g4980577; TIGR:TM0088
A;Experimental source: strain MSB8
C;Genetics:
A;Gene: TM0088

Query Match 7.2%; Score 123.5; DB 2; Length 1285;

Best Local Similarity 21.5%; Pred. No. 2.4;

Matches 86; Conservative 78; Mismatches 129; Indels 107; Gaps 23;

Ov 1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQD KKT DKASEEV SKS 45 

- Y : | | I - I \: : \ \: Ml III I I I 

Db 556 LKVAMLSGKEEEN VQKAAEELQ 1 1 S SEERI IRFVKKTENVPIDKAKNWLQLYSVS 611 

Ov 46 LQAMKE I LCGTNEKE P PTE AVAQLAQEL YS SGL LVTLIAD- - 85 

- : I hi I I I I = = = I I ^ S = „ n 

Db 612 IEELGNELWIGERE - EVEKAADLLQKI FS SEVEI SRDFVKLPSWIDEQEKLLE WKNSA 670 

Qy S6 - - -LQIiID FEGKKD VTQIFNNILRRQIG -TRSPTVEYI - - - SAHPHILFML 129 



Db 


671 


Qy 


130 


Db 


730 


Qy 


177 


Db 


.781 


Qy 


2 34 


Db 


837 


Qy 


2 34 


Db 


893 



:| ||! |: ::|::|: : :| : |||:: h 



IECIRHEPLAKI IL FSNQFRDFF KYVELST 176 

I : | ::| h : ! II I I- 

- --CFSLDQLGLLVLKGSSEAVEDLSSMYRSFFERHQKIVKENV 780 



M : : :|:: | :||| : : :: | || | I- = ■ I I 



234 ELILDRHNFAIMTKYIS - - - - KPENL -KLMMNLLRDKSPNIQFEAF -HVFKVFVAS 283 

| : I : h I : III h' I - == I 

■FLKKEEAVSEKKAVKSVTIPSGVNPDELSSYLKKLLR NVEITVFPNMGQMIVEG 892 



| =: = ||:: : I- III 



RESULT 14 
F64489 

hypothetical protein MJ1519 - Methanococcus jannaschii
C;Species: Methanococcus jannaschii
C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-Jul-2000
C;Accession: F64489
R;Bult, C.J.; White, O.; Olsen, G.J.; Zhou, L.; Fleischmann, R.D.; Sutton, G.G.;
Blake, J.A.; FitzGerald, L.M.; Clayton, R.A.; Gocayne, J.D.; Kerlavage, A.R.;
Dougherty, B.A.; Tomb, J.F.; Adams, M.D.; Reich, C.I.; Overbeek, R.; Kirkness,
E.F.; Weinstock, K.G.; Merrick, J.M.; Glodek, A.; Scott, J.L.; Geoghagen,
N.S.M.; Weidman, J.F.; Fuhrmann, J.L.; Nguyen, D.; Utterback, T.R.; Kelley,
J.M.; Peterson, J.D.; Sadow, P.W.; Hanna, M.C.; Cotton, M.D.; Roberts, K.M.;
Hurst, M.A.
Science 273, 1058-1073, 1996
A;Authors: Kaine, B.P.; Borodovsky, M.; Klenk, H.P.; Fraser, C.M.; Smith, H.O.;
Woese, C.R.; Venter, J.C.
A;Title: Complete genome sequence of the methanogenic archaeon, Methanococcus
jannaschii.
A;Reference number: A64300; MUID:96337999; PMID:8688037
A;Accession: F64489
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-1175 <BUL>
A;Cross-references: GB:U67593; GB:L77117; NID:g2826427; PIDN:AAB99538.1;
PID:g1500409; TIGR:MJ1519
C;Genetics:
A;Map position: FOR1494096-1497623

Query Match 7.0%; Score 120; DB 2; Length 1175;

Best Local Similarity 21.5%; Pred. No. 3.6;

Matches 76; Conservative 58; Mismatches 131; Indels 88; Gaps 15;

Qy 7 FSKSHKNPAEIVKILKD-NLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEA 65 

|:| : : | I I hi || |: :| : | : :; |: | | 

Db 232 FNKFREENQDFDKYLTDENIAFRPHVMKKFDEFAENIKKVIAELE GSKYKYPGLPG 287 



Qy 



66 VAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHI 12 5 



I II h -I 11 = 1 : I : ' : 

Db 288 V LYFLGMEDAYSRYIELWKNEGEKGEEKLYNALI-ESLENRKENLEF 333 

0v 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFK YVELSTFDIA- 180 

Y | : : I I :|hl.l I III 1 = 

Db 334 : GITKKVID'KFIAQKEEFREFLKNYAVYYELSAFKLEK 370 

Qv 3 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSL 22 9 

| ::: -\\ I I' = = I : I =1= I 

Db 371 IKEQYEKEFINLDNIIKNPYILVED-LKEN DSFERIIFEELDSWERRRLGDKFNP 424 



QV 



23C KLLGELILDRH NFAIMTKYISKPENLKLMMNLLRDK8PNIQFEAF 274 

II I II II I Mi =11 = I h I 

Db 425 YSPYRVRALLVE- ILKRHLSSGNTTTSTK- - - - - rDLKDFFEKMDKDIVKITFDEFLRII 477 



0v 2^5 HVFKVFVAS PHKTQP I VE I LLKNQPKL I E FLS S FQKERTDDEQFADEKNYL I K 327 

:| :, | : : : : . |:' | !■ | : =: I =| : 1.1 hi 
Db 478 EE YKD IIS- - EKVE I VKKE VKNNENKE I I ELFTLKE IRE YEE I I ENT INYLLK 528 

RESULT 15 

T00246
DNA polymerase V - fission yeast (Schizosaccharomyces pombe)
C;Species: Schizosaccharomyces pombe
C;Date: 01-Feb-1999 #sequence_revision 01-Feb-1999 #text_change 31-Jan-2000
C;Accession: T00246; T39442
R;Shimizu, K.
submitted to the EMBL Data Library, March 1998
A;Description: S. pombe homolog of S. cerevisiae DNA polymerase V.
A;Reference number: Z14129
A;Accession: T00246
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-959 <SHI>
A;Cross-references: EMBL:AB012696; NID:d1224325; PIDN:BAA32046.1; PID:d1033008
R;Xiang, Z.; Aves, S.; Lyne, M.; Rajandream, M.A.; Barrell, B.G.; Volckaert, G.
submitted to the EMBL Data Library, March 1998
A;Reference number: Z21854
A;Accession: T39442
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-959 <LYN>
A;Cross-references: EMBL:AL022305; PIDN:CAA18436.1; GSPDB:GN00067;
SPDB:SPBC14C8.14c
A;Experimental source: strain 972h-; cosmid c14C8
C;Genetics:
A;Gene: pol5+; SPBC14C8.14c
A;Map position: 2
A;Introns: 66/3

Query Match 7.0%; Score 118.5; DB 2; Length 959;

Best Local Similarity 20.5%; Pred. No. 3.5;

Matches 80; Conservative 63; Mismatches 135; Indels 113; Gaps 19;

Qy 9 KSHKN PAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62
U || || :::|:: :|::| | || || := = : | |
Db 522 KSPKNNLLISMDESVISIVQKSLSVLKKVTKKIDKKAQHL-QQLNAF 567

Qy 63 TEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKD--VTQIFNNILRRQIGTRSPTVEYI- 119
Mil l| I II I ■ : : II :|= = II I
Db 568 QLLYSLVLLQVYAGDTDSIDVLEDIDNCYSKVFNKKSKRESTSNEPTAMEIL 619

Qy 120 SAHPHILF--MLLKGY EAPQIALRC GIMLRECI 150
: | : I II : III I : I s
Db 620 TEVMLSLLSRPSTjLLRKLVDMLFTSFSEDMNRES IHLICDVLKAKESVKDSEGMFAGEEV 679

Qy 151 RHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIF 210
Db 680 EEDAFGE TEMDEDDFEEIDTDEIEEQSD WEMIGNQDASDNEELERKLDKVL 730

Qy 211 EDYEKLLQ SENYVTKRQSLKL LGELILDRHNFAIMTKYISKPENLKLMMNLL 262
" M : :: | : I I I I I : : I I I : I
Db 731 EDADAKVKDEESSEEELMNDEQMLALDEKLAEVFRER KKASNKEKKKNAQ 780

Qy 263 RDKSPNIQFEAFHVFKV--FVASPHKTQ PIVEILLKNQPKLIE 303
| :\\: || : : :||| | : : | : : i | : | : : |
Db 781 ETKQQTVQFKV KVIDLIDNYYKTQPNNGLGFEFLIPLLEMILKTKHKVLEEKGQAV 836

Qy 304 FLSSFQKERTDDEQFADEKNYL--IKQIRDL 332
| : | : :|: I II ■ = I - I
Db 837 FRNRLSKLKWTEEK-PSSKNVLEALKKVHVL 866

Search completed: January 7, 2004, 16:46:10
Job time: 32 secs

Title:
Perfect score:
Sequence:

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5

Searched: 747907 seqs, 201509753 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Post-processing: Minimum Match 0%
Published_Applications_AA:*

/cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
/cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
/cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
/cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
/cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
/cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:*
/cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
/cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
/cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
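
As a rough illustration of how Pred. No. can be used when reading the summaries, the sketch below keeps only hits whose Pred. No. falls at or below a cutoff. The cutoff of 1e-3, the `Hit` record, and the `plausible_hits` helper are assumptions made for this example and are not part of the GenCore output; the identifiers, scores, and Pred. No. values are copied from summary lines in this listing.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    identifier: str
    score: float
    pred_no: float  # results expected by chance with a score at least this high


def plausible_hits(hits: List[Hit], max_pred_no: float = 1e-3) -> List[Hit]:
    """Keep hits at or below the (assumed) cutoff, smallest Pred. No. first."""
    kept = [h for h in hits if h.pred_no <= max_pred_no]
    return sorted(kept, key=lambda h: h.pred_no)


# Values copied from summary lines in this listing.
hits = [
    Hit("US-10-239-079-5", 1381.0, 1.6e-117),
    Hit("US-10-025-730-1", 1704.0, 3.1e-147),
    Hit("T33477", 143.5, 0.02),
    Hit("T08880", 125.5, 2.4),
]

for hit in plausible_hits(hits):
    print(hit.identifier, hit.pred_no)
```

A Pred. No. near or above 1 (for example 2.4 for T08880) means a score that high is expected by chance alone, so such hits are dropped under any reasonable cutoff.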
ALIGNMENTS



RESULT 1 
US-10-025-730-1 

; Sequence 1, Application US/10025730
; Publication No. US20030045466A1
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/10/025,730
; CURRENT FILING DATE: 2001-12-18
; PRIOR APPLICATION NUMBER: US/09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 1
; LENGTH: 337
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE: -
; OTHER INFORMATION: 3734805
US-10-025-730-1

Query Match 100.0%; Score 1704; DB 15; Length 337;

Best Local Similarity 100.0%; Pred. No. 3.1e-147;

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
Qy   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60

Qy  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120

Qy 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||
Db 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
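
The percentages in each summary block appear to follow directly from the other figures in that block. The sketch below recomputes them under two assumptions inferred from the numbers in this listing, not taken from GenCore documentation: Query Match is the hit score divided by the score of the perfect self-match (1704 for this query), and Best Local Similarity is the number of matches divided by the number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


PERFECT_SCORE = 1704.0  # score of the 100% self-match, US-10-025-730-1

# US-10-239-079-5: Score 1381; Matches 273, Conservative 31, Mismatches 29, Indels 4
print(query_match_percent(1381, PERFECT_SCORE))
print(best_local_similarity(273, 31, 29, 4))

# H64574: Score 134.5; Matches 88, Conservative 58, Mismatches 134, Indels 127
print(query_match_percent(134.5, PERFECT_SCORE))
print(best_local_similarity(88, 58, 134, 127))
```

The same arithmetic can serve as a consistency check when a count in a summary line is hard to read: a count that does not reproduce the printed percentage has probably been misread.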

RESULT 2 
US-10-239-079-5
; Sequence 5, Application US/10239079
; Publication No. US20030148446A1
; GENERAL INFORMATION:
; APPLICANT: Merck Patent GmbH
; TITLE OF INVENTION: ANIC-BP1 - ligand
; FILE REFERENCE: ANIC-BP-1-ligand
; CURRENT APPLICATION NUMBER: US/10/239,079
; CURRENT FILING DATE: 2002-09-19
; NUMBER OF SEQ ID NOS: 8
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 5
; LENGTH: 496
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: Gal4-ANIC-BP-
; OTHER INFORMATION: fusion protein
US-10-239-079-5

Query Match 81.0%; Score 1381; DB 12; Length 496;

Best Local Similarity 81.0%; Pred. No. 1.6e-117;

Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps



Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
|| 1 lllhlhlll IM-hllll Ml Mhllllhl Mill! Mill
Db 156 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 215

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
M iiiiiMiiiihiiii ihiiiiiiiitiMii in ^
Db 216 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 275

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
MMMMMMhlM Ml 1 1 ! I 1 1 1 1 1 1 1 1 1 1 1 1 = 1 II MhMhlMM
Db 276 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 335

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
.MIIIIIMMMIIhl hlllhli 1 -Mill II 1 1 i M 1 II 1 M M II M 1 1
Db 336 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 395

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
||! 1 1 1 1 1 1 1 1 1 1 ! 1 M II II II M 1 1 II II 1 M II II II h 1 M II 1 MM II II 1
Db 396 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 455

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
Ml MM M Mhiill Ml 1 MM II MM 1
Db 456 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 492


RESULT 3 

US-10-239-079-6
; Sequence 6, Application US/10239079
; Publication No. US20030148446A1
; GENERAL INFORMATION:
; APPLICANT: Merck Patent GmbH
; TITLE OF INVENTION: ANIC-BP1 - ligand
; FILE REFERENCE: ANIC-BP-1-ligand
; CURRENT APPLICATION NUMBER: US/10/239,079
; CURRENT FILING DATE: 2002-09-19
; NUMBER OF SEQ ID NOS: 8
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 6
; LENGTH: 552
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: LexA-ANIC-BP-1
; OTHER INFORMATION: fusion protein
US-10-239-079-6

Query Match 81.0%; Score 1381; DB 12; Length 552;

Best Local Similarity 81.0%; Pred. No. 1.9e-117;

Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps



Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
II | | II 1 : 1 1 • 1 II 1 1 : : : 1 : 1 1 1 1 Ml = 1 h 1 1 1 1 1 = 1 II 1 1 1 1 1 1 1 1 1
Db 212 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 271

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
1 1 1 1 1 1 1 1 1 1 1 II | : 1 1 II 1 h 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 II 1 M 1 1 1 1 1 1 1 h 1 1 II 1 1
Db 272 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 331

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
:||||||IMhhMI 1 1 1 1 1 1 i 1 1 i 1 1 1 1 1 i 1 1 1 = 1 II MhllMMI!
Db 332 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 391

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
illllMMIIIIIIhl |:|IM:|I 1 :.IMII 1 1 1 1 1 1 1 1 1 1 1 1 M ! 1 M 1 1
Db 392 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 451

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
Ml UMIIillliiiMIMIllli illlllllMlllihhIIMI-llllll
Db 452 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 511

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
MIIMI II Uhllii III 11 = ill 1 111= 1
Db 512 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 548





RESULT 4
US-10-025-730-3
; Sequence 3, Application US/10025730
; Publication No. US20030045466A1
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/10/025,730
; CURRENT FILING DATE: 2001-12-18
; PRIOR APPLICATION NUMBER: US/09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 3
; LENGTH: 341
; TYPE: PRT
; ORGANISM: Mus sp.
; FEATURE: -
; OTHER INFORMATION: g262934

Query Match 80.8%; Score 1376; DB 15; Length 341;

Best Local Similarity 80.7%; Pred. No. 2.8e-117;

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps



Ov 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DK^ 59 

Q/ I I I I I I : | | : | | | ||:::|:MM III =11 = 11111 = 1 Ml II I MM 

Db x MPFPFGKSHKSPADIV^ 60 

60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

Qy eo ^f^^? |1|:MM : | || mini | Ml I Nihil I Mi 

Ov ] 2 0 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDF^ "9 

Q/ . 1 1 1 1 1 1 1 1 1 1 . 1 r 1 1 1 IIMMMIM-MIMMhl II 1 1 h 1 1 h 1 1 1 1 1 

Db 121 CTQQNXLFMLLKGYES 180 

Qv , 180 aSDAFATFKDLLTRHKVIjVADFLEQNYOT 239 

1 1 ; 1 1 1 1 1 1 1 1 1 1 1 i I : | | : h I MINI I M 1 1 1 M M M M M I • I -II 

Db 181 lllllk?^ 240 

Ov 240 HNFAIMTKY I S KPENLKLMMNLLRDKSPNIQFEAFHVFKVFVAS PHKTQP I VE IL^F3STQP 299 

Qy mi I > 1 1 i 1 1 ! ! 1 1 1 1 11 M 1 1 1 1 M III MM IIIMIMMIIIhMMIM 

.41 HNFTIMTKYISKPE^ 3 °° 



Db 



300 KLIEFLSSFQKERTDDEQFAD'EICNYLIKQIRDLKKTA 336 

i HI II I II Ml Mill II! II Mill MM I 

3 01 ICLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA 337 



RESULT 5
US-10-025-730-4
; Sequence 4, Application US/10025730
; Publication No. US20030045466A1
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/10/025,730
; CURRENT FILING DATE: 2001-12-18
; PRIOR APPLICATION NUMBER: US/09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 4
; LENGTH: 339
; TYPE: PRT
; ORGANISM: Drosophila melanogaster
; FEATURE: -
; OTHER INFORMATION: g1794137
US-10-025-730-4

Query Match 65.1%; Score 1109; DB 15; Length 339;

Best Local Similarity 65.0%; Pred. No. 6.5e-93;

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps
Ov 4 MPLFSKSHKNPAEIVKILKI)NLAILEKQDKKTDKASEE 63 

II 1 1 II hi HI 1 1 : : M hi : 1 1 I - M 1 = I : : I : ! I : : : Ml 

Db 1 ipipGKSQKSPVELVKSLKEAINALEAG 60 

64 E--AVAQLAQELYSSGLLVTLIADLQLIDFE^ 122 

7 - III hi II hi I h II M Mill 1 1 I Nihil II I II II III II I 

61 DYWAQLSQELYNSNLLL 120 
123 PHILFMLLKGYE- - APQIALRCGIMLRECIRHEPLAKI ILFSNQFRDFFKYVELSTFDIA 180 

I 1 1 1 h 1 1 1 hill I Mill hi III hi I- I Ihll hill III 

121 PEILFTLMAGYEDAH 180 
Qv 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKL^ 239 

1 1 1 I ♦ 1 1 1 - M 1 1 1 h I h I h I : h M I I II II h II 1 1 II I II h M I 

Db 181 SDAFSTFKELLT^ 240 

240 HtfF^IMTKYISKPENL^ 2 " 

Ml" - Ihll hill Mill hMMMIIMM II Mill h hi MhMlhM 

Db 241 HNFTVlllTRYIS^ 300 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333

° Y M : : | |: : I M::MM Ml I I I I I I 5 : I I 

Db 301 KLVDFLTNFHTDRSEDEQFNDEKAYLIKQIKELK 334

RESULT 6
US-10-025-730-5
; Sequence 5, Application US/10025730
; Publication No. US20030045466A1
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/10/025,730
; CURRENT FILING DATE: 2001-12-18
; PRIOR APPLICATION NUMBER: US/09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 5
; LENGTH: 377
; TYPE: PRT
; ORGANISM: Caenorhabditis elegans
; FEATURE: -
; OTHER INFORMATION: g1255838
US-10-025-730-5

Query Match 62.4%; Score 1063.5; DB 15; Length 377;
Best Local Similarity 60.5%; Pred. No. 1.1e-88;
Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps




Qy 



& MP-LFSKSHKNPAEIVKILKDNLAILEK QDKKTDKASEEVSKSLQAM 4 9 

II II II I M I M M I I - I I I : I Ml Ml MMM = 

Db 1 MPLLFGKSHKSPADVVKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60



0v so KEILCGTNEKEPPTE AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106 

| : | : || :| I I I I I I I' = I s = : I II I : I I MM Ml 

Db 61 KSFIYGNDSAEPSSEHWQVAQLAQEVYNANILPMLIKMLPKFEFECKKDVGQIFNNLLR 120 

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166

1 1 1 1 M 1 1 II I M I I II MM I I Mill MM I MM MM I 

Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226

IMM M Ml II Ml I Ml MM -I MM MM I M II MMMM 

Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240

Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286

M 1 1 M MM 1 1 1 1 1 1 ! 1 1 1 1 I M M I II I 1 1 M 1 1 II 1 1 1 1 1 1 M M I 

Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335

Ml Ml M: MMMM M I II I II 1 1 1 M II I M = M = 

Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349



RESULT 7
US-10-029-386-32324
; Sequence 32324, Application US/10029386
; Publication No. US20030194704A1
; GENERAL INFORMATION:
; APPLICANT: Penn, Sharron G.
; APPLICANT: Rank, David R.
; APPLICANT: Hanzel, David K.
; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
; TITLE OF INVENTION: EXPRESSION ANALYSIS TWO
; FILE REFERENCE: AEOMICA-X-2
; CURRENT APPLICATION NUMBER: US/10/029,386
; CURRENT FILING DATE: 2001-12-20
; NUMBER OF SEQ ID NOS: 34288
; SOFTWARE: Annomax Sequence Listing Engine vers. 1.1
; SEQ ID NO 32324
; LENGTH: 820
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: MAP TO AC000066.1
; OTHER INFORMATION: EXPRESSED IN HELA, SIGNAL = 0.87
; OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 1.4
; OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 1.1
; OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 1.6
; OTHER INFORMATION: EXPRESSED IN LUNG, SIGNAL = 1.3
; OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL = 1.3
; OTHER INFORMATION: SWISSPROT HIT: Q99996, EVALUE 0.00e+00
US-10-029-386-32324

Query Match 7.5%; Score 128.5; DB 12; Length 820;



Best Local Similarity 20.1%; Pred. No. 0.0072; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 
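
The "Best Local Similarity" figures in this report are consistent with the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The report does not state this definition; the sketch below assumes it, and the function name is illustrative only.

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Percent identity over all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns

# Counts taken from the statistics line directly above (Result 7).
result_7 = best_local_similarity(77, 75, 116, 115)
print(f"{result_7:.1f}%")
```

Under the same assumption, the counts reported for the other hits in this listing (for example 211, 53, 68 and 17) reproduce their printed percentages as well.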

Qv 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

- ! M : I hh - h Ih M I 

Db 358 lEKLKDNLGIHYKQ- -QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 404 

0v 78 LLVTLIADLQ- -LIDFEGKKDVTQIFNNILRRQI GTRS PTVE YT S AHPH I 125 

:: : ||| |:: : :: III h-l M h = 

Db 405 - -ISKLKDLQQSLVNSKSEEMTLQI- -NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 460 

Qv 1-6 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

I =| I | | :| | | 

Db 4i 5i LEKQMKEKE NDLQEKFAQLEAEN - S ILKDEKK 491 

Qv lg g TFKDLLTRH KVLVADFLE - QNYDT I FEDYEKLLQSENYVTKRQSLKLLGEL IL 237 

| : | : | | :::! :::: | ::: | ::| || hi ::| h 

Db 492 TLEDMLKIHTPVSQEERLIFLDS J.KSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 551 



Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264 

1= |: I I II : I : : I I 

Db 552 QRNTFSFAEKNFEVNYQELQEEYACL^KVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 611 

0v 2S5 KSPNJQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320 

: : : : | MM : I • I - = h =1-1 I : : l : : : : : I 
Db f,i2 NPTTVKMKSSVFDEDKTFVA ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 668 

q- r "v">l EKNYLIKQIRDLKK 334 

| ::| :::: ||: 
Db 669 SEQLKQKHGEISFLNEEVKSLKQ- 691 

RESULT 8
US-10-080-608A-11
; Sequence 11, Application US/10080608A
; Publication No. US20030198956A1
; GENERAL INFORMATION:
; APPLICANT: Makowski, Lee
; APPLICANT: Hyman, Paul
; APPLICANT: Williams, Mark
; TITLE OF INVENTION: STAGED ASSEMBLY OF NANOSTRUCTURES
; FILE REFERENCE: 8471-010-999
; CURRENT APPLICATION NUMBER: US/10/080,608A
; CURRENT FILING DATE: 2002-02-21
; NUMBER OF SEQ ID NOS: 180
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 11
; LENGTH: 3878
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-080-608A-11

Query Match 7.5%; Score 128.5; DB 12; Length 3878;

Best Local Similarity 20.1%; Pred. No. 0.065; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 
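
Each hit carries the same three-line statistics block, so it can be read mechanically. The following is a minimal sketch of one way to do that, assuming the `label value;` layout seen in the cleaner blocks of this listing; `FIELD` and `parse_stats` are hypothetical names, not part of GenCore.

```python
import re

# One "label value;" pair; a trailing "%" on the value is tolerated.
FIELD = re.compile(
    r"(Query Match|Score|DB|Length|Best Local Similarity|Pred\. No\.|"
    r"Matches|Conservative|Mismatches|Indels|Gaps)\s+([0-9.eE+-]+)%?;"
)

def parse_stats(text: str) -> dict:
    """Map each statistics label to its numeric value."""
    return {name: float(value) for name, value in FIELD.findall(text)}

block = """Query Match 7.5%; Score 128.5; DB 12; Length 3878;
Best Local Similarity 20.1%; Pred. No. 0.065;
Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15;"""

stats = parse_stats(block)
print(stats["Score"], stats["Pred. No."], stats["Gaps"])
```

Blocks damaged by scanning (split digits, missing semicolons) will not match this pattern and would need to be repaired first.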
Qy is VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 






664 IEKLKDNLGIHYKQ- -QIDGLQNEMSQKIETMQ - -FEKDNLITKQNQLILE 710 
78 LLVTLIADLQ- -L'IDFEGKKDVTQIFNNILRRQI — — — -GTRSPTVEYISAHPH1 125 
ISKLKD'liQQSLVNSKSEEMTliQI -■ -NELQKEIEILRQEEKEKGTLEQEVQELQIjKTEL 766 
■136 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

767 LqmLkE NDLQEKFAQLEAEN - S I LKDEKK 797 

i86 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237 

798 iLEDMLlLpVSQEERLIFLLiKSKSKisVWEKEIEILIEENEDLKQQClQLNEEIEK 857 

YISKPENLKLMMNLLRD 264 

q y 238 DRHNFAIMTK || I : I : : H 

Db - . 858 qrnTFSFAEKNFEWYQELQEEYACLLKVKDDLEDS 917 

Qy 265 .KSPNIQFEA- -FHVFKVFVASPHCT 320 

918 NPTTV^KSSVFDEDKTFVA- - -eIlEMGEWEKDTTELMEKDEVTKREKLELSQRLSDL 974 



Db 

Qy 

Db vii 

Qy 

Db 

Qy 

Db 



Db 

Qy 

Db 



32i EKNYLIKQIRDLKK 3 34 

! : = | 11 = 

375 S EQLKQKHGE I S FLNEEVKSLKQ 997 



RESULT 9
US-10-171-311-4
; Sequence 4, Application US/10171311
; Publication No. US20030087270A1
; GENERAL INFORMATION:
; APPLICANT: Schlegel, Robert
; APPLICANT: Chen, Yan
; APPLICANT: Zhao, Xumei
; APPLICANT: Monahan, John
; APPLICANT: Kamatkar, Shubhangi
; APPLICANT: Glatt, Karen
; APPLICANT: Gannavarapu, Manjula
; TITLE OF INVENTION: NOVEL GENES, COMPOSITIONS, KITS, AND METHODS FOR
; TITLE OF INVENTION: IDENTIFICATION, ASSESSMENT, PREVENTION, AND THERAPY
; TITLE OF INVENTION: OF CERVICAL CANCER
; FILE REFERENCE: MRI-035
; CURRENT APPLICATION NUMBER: US/10/171,311
; CURRENT FILING DATE: 2002-06-12
; PRIOR APPLICATION NUMBER: US 60/298,159
; PRIOR FILING DATE: 2001-06-13
; PRIOR APPLICATION NUMBER: US 60/298,155
; PRIOR FILING DATE: 2001-06-13
; PRIOR APPLICATION NUMBER: US 60/335,936
; PRIOR FILING DATE: 2001-11-14
; NUMBER OF SEQ ID NOS: 238
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 4
; LENGTH: 3899
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-171-311-4

Query Match 7.5%; Score 128.5; DB Length 3899;
Best Local Similarity 20.1%; Pred. No. 0.066;
Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15;
Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

Db 652 IEKLKDNLGIHYKQ- -QIDGLQNEMSQKIETMQ- FEKDNLITKQNQLILE 698 

78 LLVTL I ADLQ - - L I DFEGKKDVTQ I FNN ILRRQ I GTRS PTVEY I S AHPHI 125 

599 _ -isKLKDLQQSLVNSKSEEMTLQI- -NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 754 
126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185. 

755 LkQmJcEKE- — , — - NDLQEKFAQLEAEN-SILKDEKK 785 

136 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237 

786 iLEDMiKIOTPVSQEERLIFLLlKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 845 

YISKPENLKLMMNLLRD 264 

23 8 DRHNFAIMTK | | I • I = - I I 

846 qrntFSFAeLfEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 905 

■265 -<SPNiQFEA--FHVFK7FVASPHKTQPTVEILLI^QPKLIEFLSSFQKERTD-DEQFAD- 320 
906 NPTTVKMKSSVFDEDKTFVA- - -EIXEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 962 

3 -i.i - EKNYLIKQIRDLKK 334 

Db 963 SEQLKQKHGEISFLNEEVKSLKQ 985 

RESULT 10
US-10-171-311-2
; Sequence 2, Application US/10171311
; Publication No. US20030087270A1
; GENERAL INFORMATION:
; APPLICANT: Schlegel, Robert
; APPLICANT: Chen, Yan
; APPLICANT: Zhao, Xumei
; APPLICANT: Monahan, John
; APPLICANT: Kamatkar, Shubhangi
; APPLICANT: Glatt, Karen
; APPLICANT: Gannavarapu, Manjula
; TITLE OF INVENTION: NOVEL GENES, COMPOSITIONS, KITS, AND METHODS FOR
; TITLE OF INVENTION: IDENTIFICATION, ASSESSMENT, PREVENTION, AND THERAPY
; TITLE OF INVENTION: OF CERVICAL CANCER
; FILE REFERENCE: MRI-035
; CURRENT APPLICATION NUMBER: US/10/171,311
; CURRENT FILING DATE: 2002-06-12
; PRIOR APPLICATION NUMBER: US 60/298,159
; PRIOR FILING DATE: 2001-06-13
; PRIOR APPLICATION NUMBER: US 60/298,155
; PRIOR FILING DATE: 2001-06-13
; PRIOR APPLICATION NUMBER: US 60/335,936
; PRIOR FILING DATE: 2001-11-14
; NUMBER OF SEQ ID NOS: 238
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 2
; LENGTH: 3907
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-171-311-2

Query Match 7.5%; Score 128.5; DB 15; Length 3907;
Best Local Similarity 20.1%; Pred. No. 0.066;
Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15;



Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77
•• Mill 1 || : 1 |:|s- :: |s |h Ml.
Db 652 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 698

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI----------GTRSPTVEYISAHPHI 125
Db 699 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 754

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185
Db 755 785

Qy 186 TFKDLLTRH-------KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237
! . 1 . 1 1 : : .• | : : : : | : : : | = : I 1 1 1 = 1- : : 1 .1 :
Db 786 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 845

Qy 238 DRHNFAIMTK---------------------------------YISKPENLKLMMNLLRD 264
i. i. 1 i II : 1 S! 'l l
Db 846 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 905

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320
.... | | | | | : | : | : : ! = • \ ■ \ \ ■ ■ \ ■ : : : 1
Db 906 NPTTVKMKSSVFDEDKTFVA---ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 962

Qy 321 EKNYLIKQIRDLKK 334
| ::| :::= 11 =
Db 963 SEQLKQKHGEISFLNEEVKSLKQ 985





RESULT 11
US-10-370-685-100
; Sequence 100, Application US/10370685
; Publication No. US20030215903A1
; GENERAL INFORMATION:
; APPLICANT: Hyman, Paul
; APPLICANT: Goldberg, Edward
; TITLE OF INVENTION: Nanostructures Containing PNA Joining and Functional
; TITLE OF INVENTION: Elements
; FILE REFERENCE: NANF.P-004
; CURRENT APPLICATION NUMBER: US/10/370,685
; CURRENT FILING DATE: 2003-02-21
; PRIOR APPLICATION NUMBER: 10/080,608
; PRIOR FILING DATE: 2002-02-21
; NUMBER OF SEQ ID NOS: 159
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 100
; LENGTH: 3911
; TYPE: PRT
; ORGANISM: human
US-10-370-685-100

Query Match 7.5%; Score 128.5; DB 12; Length 3911;
Best Local Similarity 20.1%; Pred. No.
Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15;

13 VKILKDNIATLSKQPKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAV 77 

664 IEKlULiHyIcI -QlioLQNiMSQKiETiQ FEKDNLITKQNQLILE- - - - - 710.- 

78 LLVTLIADLQ-LIDFEGKKDVTQ1FNNILRRQI - - — — -GTRSPTVEYISAHPHI 125 
711 - - ISKLKDLQQSLVNSKSEEMTLQI - -NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 766 
l 3fi LFMLLKGYEAPQI ALRCGIMLRECIRHEPLAKI ILFSNQFRDFFKYVEL3TFDI ASDAFA 185- 

, v."/ LkQmIeKE -NDLQEKFAQLEAEN-SILKDEKK 

• V ; C TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237' 

^ - 798 ttjEDmIkihtpvsqeerlifldsikskskd 857 

yiskpenlklmmnllrd 2 6.4 

qv 2 3 fi- DRHNFAIMTK | || * I :: l I 

D - & 358 qUtFSFAeLfEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEEMLQRI 917 

,, y 265 KSPN1QFEA--FHVFKVFVASPHKTQPIVE1™ 320 
to 918 npt^KSSvLedLUI- - -ETLEMGEVVEKDTTELMEKLEVTKREKLELSQRLSDL 974 
-> 21 EKNYLIKQIRDLKK 334 



Qy 

Db 

Qy 

Db 

Qy 

Db 



Qy 



Db 975 SEQLKQKHGEISFLNEEVKSLKQ 997 



RESULT 12
US-10-171-311-8
; Sequence 8, Application US/10171311
; Publication No. US20030087270A1
; GENERAL INFORMATION:
; APPLICANT: Schlegel, Robert
; APPLICANT: Chen, Yan
; APPLICANT: Zhao, Xumei
; APPLICANT: Monahan, John
; APPLICANT: Kamatkar, Shubhangi
; APPLICANT: Glatt, Karen
; APPLICANT: Gannavarapu, Manjula
; APPLICANT: Hoersh, Sebastian
; TITLE OF INVENTION: NOVEL GENES, COMPOSITIONS, KITS, AND METHODS FOR
; TITLE OF INVENTION: IDENTIFICATION, ASSESSMENT, PREVENTION, AND THERAPY
; TITLE OF INVENTION: OF CERVICAL CANCER
; FILE REFERENCE: MRI-035
; CURRENT APPLICATION NUMBER: US/10/171,311
; CURRENT FILING DATE: 2002-06-12
; PRIOR APPLICATION NUMBER: US 60/298,159
; PRIOR FILING DATE: 2001-06-13
; PRIOR APPLICATION NUMBER: US 60/298,155
; PRIOR FILING DATE: 2001-06-13
; PRIOR APPLICATION NUMBER: US 60/335,936
; PRIOR FILING DATE: 2001-11-14
; NUMBER OF SEQ ID NOS: 238
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 8
; LENGTH: 3917
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-171-311-8

Query Match 7.5%; Score 128.5; DB 15; Length 3917;
Best Local Similarity 20.1%; Pred. No. 0.066;
Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15;

Qv 18 VKILKDNLAILEKQDKKTOKASEEVSKSLQAMKEILCGTNE^PPTEAVAQLAQEL 77 

m , S52 TEKIiKDNLGIHYKQ- -QIDGLQNEMSQKIETMQ -FEKDNLITKQNQLILE 698: 

0 V 78 LLVTLIADLQ- -LIDFEGKKDVTQIFNNILRRQI - -GTRSPTVEYISAHPHI 12 5 

:Db ' ., 99 - - ISKLKDLQQSLWSKSEEMTLQI - -NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 754 

Qy ' ,26 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVEIjSTFDTASPAFA 185 

Db : ,55 LEKQmLkE — — NDLQEKFAQLEAEN-SILKDEKK 785 

186 TFKDLLTRH KVLVADFLE - QNYDT I FEDYEKLLQSENYVT^QSLKLLGEL IL 237 

786 i L EUKlLpVSQEERLiFLLiKSKSKDSvWEKElEILIEE N EDLKQQCIQLNEEIEK 845 



Qy 

Db 

_ YISKPENLKLMMNLLRD 264 

Qy 238 DRHNFAIMTK | I I • I : : I I 

Db 845 qUtIsFAeLfEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 905 



Qy 

Db 



5 KSPNIQFEA- -FHVFKVFVASPHKTQPIVEILL^QPKLIEFLSSFQKERTD-DEQFAD- 320 
906 NPTTVKMKSSVFDEDKTFVA eTLEMGEWEKDTTELMEKLEVTKREKLELSQPLSDL 962 

-591 E KNYL I KQ I RDLKK 334 

& 3 | : = | lh 

Db 963 SEQLKQKHGE I SFLNEEVKSLKQ 985 



RESULT 13
US-10-171-311-6
; Sequence 6, Application US/10171311
; Publication No. US20030087270A1
; GENERAL INFORMATION:
; APPLICANT: Schlegel, Robert
; APPLICANT: Chen, Yan
; APPLICANT: Zhao, Xumei
; APPLICANT: Monahan, John
; APPLICANT: Kamatkar, Shubhangi
; APPLICANT: Glatt, Karen
; APPLICANT: Gannavarapu, Manjula
; APPLICANT: Hoersh, Sebastian
; TITLE OF INVENTION: NOVEL GENES, COMPOSITIONS, KITS, AND METHODS FOR
; TITLE OF INVENTION: IDENTIFICATION, ASSESSMENT, PREVENTION, AND THERAPY
; TITLE OF INVENTION: OF CERVICAL CANCER
; FILE REFERENCE: MRI-035
; CURRENT APPLICATION NUMBER: US/10/171,311
; CURRENT FILING DATE: 2002-06-12
; PRIOR APPLICATION NUMBER: US 60/298,159
; PRIOR FILING DATE: 2001-06-13
; PRIOR APPLICATION NUMBER: US 60/298,155
; PRIOR FILING DATE: 2001-06-13
; PRIOR APPLICATION NUMBER: US 60/335,936
; PRIOR FILING DATE: 2001-11-14
; NUMBER OF SEQ ID NOS: 238
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 6
; LENGTH: 3925
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-171-311-6

Query Match 7.5%; Score 128.5; DB 15; Length 3925;
Best Local Similarity 20.1%; Pred. No. 0.067;
Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15;

... 18 vKrLKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

•• Mill I II =1 hh :: 1= II s 11 1 

Db 652 IEKLKDNLGIHYKQ- -QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE- 698- 

0 .> . 8 ijjVTLIADLQ - - LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

' ...Mil::::: I I I I = = = I I I I : : 

Db 699 - - ISKLKDLQQSLVNSKSEEMTLQI - -NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 754 

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 
Db 75 5 LEKQMKEKE- -NDLQEKFAQLEAEN - SILKDEKK 785 

<*• w36 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237: 

' 1 ~ I . I . I I : : : |. : : : : | : : : | : = I II ! : I : : I I : 

Db 736 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 845^ 

232 DRHHFAIMTK " YISKPENLKLMMNLLRD 264 

I . j . I I I I : I : : I I 

346 QRNTFS FAEKNFE VNYQELQEEYACLLKVKDBLEDS KNKQELE YKS KLKALNEELHLQRI 905 

265 KSPNIQFEA- -FHVFKVFVASPHKTQPIVEILLKNQPKL1EFLSSFQKERTD-DEQFAD- 320 

.... | ; | | | : | : | : : | : : I : I I : : ! : : : : : I 
906 NPTTVKMKSSVFDEDKTFVA- - -ETLEMGEWEKDTTELMEKLEVTKREKLELSQRL3DL 962 



Qy 

Db 

Qy 

Db 



Qy 



321 



EKNYLIKQIRDLKK 3 34 



Db 



963 SEQLKQKHGE I S FLNEEVKSLKQ 985 



RESULT 14
US-09-864-761-47959
; Sequence 47959, Application US/09864761
; Patent No. US20020048763A1
; GENERAL INFORMATION:
; APPLICANT: Penn, Sharron G.
; APPLICANT: Rank, David R.
; APPLICANT: Hanzel, David K.
; TITLE OF INVENTION: GENE EXPRESSION ANALYSIS BY MICROARRAY
; FILE REFERENCE: Aeomica-X-1
; CURRENT APPLICATION NUMBER: US/09/864,761
; CURRENT FILING DATE: 2001-05-23
; PRIOR APPLICATION NUMBER: US 60/180,312
; PRIOR FILING DATE: 2000-02-04
; PRIOR APPLICATION NUMBER: US 60/207,456
; PRIOR FILING DATE: 2000-05-26
; PRIOR APPLICATION NUMBER: US 09/632,366
; PRIOR FILING DATE: 2000-08-03
; PRIOR APPLICATION NUMBER: GB 24263.6
; PRIOR FILING DATE: 2000-10-04
; PRIOR APPLICATION NUMBER: US 60/236,359
; PRIOR FILING DATE: 2000-09-27
; PRIOR APPLICATION NUMBER: PCT/US01/00666
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00667
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00664
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00669
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00665
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00668
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00663
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00662
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00661
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00670
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: US 60/234,687
; PRIOR FILING DATE: 2000-09-21
; PRIOR APPLICATION NUMBER: US 09/608,408
; PRIOR FILING DATE: 2000-06-30
; PRIOR APPLICATION NUMBER: US 09/774,203
; PRIOR FILING DATE: 2001-01-29
; NUMBER OF SEQ ID NOS: 49117
; SOFTWARE: Annomax Sequence Listing Engine vers. 1.1
; SEQ ID NO 47959
; LENGTH: 660
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; OTHER INFORMATION: MAP TO AJ010770.1
; OTHER INFORMATION: EXPRESSED IN LUNG, SIGNAL = 2
; OTHER INFORMATION: EXPRESSED IN BT474, SIGNAL = 1.1
; OTHER INFORMATION: SWISSPROT HIT: Q99323, EVALUE 3.00e-17
; OTHER INFORMATION: EST_HUMAN HIT: AU132932.1, EVALUE 1.00e-105
US-09-864-761-47959

Query Match 7.3%; Score 125; DB 9; Length 660;
Best Local Similarity 20.5%; Pred. No. 0.011;
Matches 75; Conservative 68; Mismatches ; Indels 104; Gaps 13;

Qy . 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

Db 342 IEKLKDNLGIHYKQ- -QIDGLQNEMSQKIETMQ- -= FEKDNLITKQNQLILE- 388 



78 LLVTLIADLQ- -LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

339 _ _ isKLKDLQQSLWSKSEEMTLQI - -NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 444- 
12 6" LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

'.II I : : I • I i 1 

: 1 . NDLQEKFAQLEAEN-STLKDEKK 4 75 

44 5 LEKQMKEKE 

i ,p FKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 23T 

136 FKDLLTRH . . . I : : : : |: : : | ::| I I 1 = 1 "! I : 

47S kEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 535 

visKPENLKLMMNLLRD 2 64 

i i j I II : I -II 

536 QfUSTTFS^ 595 
265 KSPNIQFEA- -FHVFKVFVASPHKTQPIV^ 322 
59 6 NPTTVKMKSSvLdItUI- - -EiLEMciwEKDTTELMEKLEVTKREKLELSQRLSDL 652 

Qy 32 3 NYLIKQ 32 8 

D b 653 SEQLKQ 65 8 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 23 8 DRHNFAIMTK 

Db 

Qy 

Db 



RESULT 15
US-10-023-634-18
; Sequence 18, Application US/10023634
; Publication No. US20030236389A1
; GENERAL INFORMATION:
; APPLICANT: Shimkets, Richard A
; APPLICANT: Colman, Steven D
; APPLICANT: Spytek, Kimberly A
; APPLICANT: Ballinger, Robert A
; APPLICANT: Guo, Xiaojia
; APPLICANT: Tchernev, Velizar T
; APPLICANT: Shenoy, Suresh G
; APPLICANT: Li, Li
; APPLICANT: Ellerman, Karen
; APPLICANT: Zerhusen, Bryan D
; APPLICANT: Patturajan, Meera
; APPLICANT: Casman, Stacie J
; APPLICANT: Boldog, Ferenc
; APPLICANT: Gusev, Vladimir Y
; APPLICANT: Burgess, Catherine E
; APPLICANT: Edinger, Shlomit R
; APPLICANT: Gangolli, Esha A
; APPLICANT: Malyankar, Uriel M
; APPLICANT: Gunther, Erik
; APPLICANT: Smithson, Glennda
; APPLICANT: Millet, Isabelle
; TITLE OF INVENTION: Proteins, Polynucleotides Encoding Them, and Methods of
; TITLE OF INVENTION: Using the Same
; FILE REFERENCE: 21402-221
; CURRENT APPLICATION NUMBER: US/10/023,634
; CURRENT FILING DATE: 2002-06-28
; PRIOR APPLICATION NUMBER: 60/256,025
; PRIOR FILING DATE: 2000-12-15
; PRIOR APPLICATION NUMBER: 60/265,163
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: 60/272,929
; PRIOR FILING DATE: 2001-03-02
; PRIOR APPLICATION NUMBER: 60/274,864
; PRIOR FILING DATE: 2001-03-09
; PRIOR APPLICATION NUMBER: 60/276,688
; PRIOR FILING DATE: 2001-03-16
; PRIOR APPLICATION NUMBER: 60/277,880
; PRIOR FILING DATE: 2001-03-22
; PRIOR APPLICATION NUMBER: 60/286,409
; PRIOR FILING DATE: 2001-04-25
; PRIOR APPLICATION NUMBER: 60/309,246
; PRIOR FILING DATE: 2001-07-31
; PRIOR APPLICATION NUMBER: 60/315,600
; PRIOR FILING DATE: 2001-08-29
; NUMBER OF SEQ ID NOS: 132
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 18
; LENGTH: 709
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-023-634-18

Query Match 6.8%; Score 116.5; DB 12; Length 709;
Best Local Similarity ; Pred. No. ;
Matches 79; Conservative 64; Mismatches ; Indels 115; Gaps 12;

3 KMPLFSKSHKNPAEIVKILKDH^ILEKQDKlcrDKASEEVSKLQflMl^IIi^TNSKBPP 62 
„, KLQWQRSLESSQGKIAQtiEGKLVSIEKE- -KIDEKS-ETEKLLEYIEEISCASDQVEKY 23 = 



Db 
QY 



63 TEAVAQLAQELYSSGLLVTLIADLQIiIDFEGKKDVTQIFNNILRRQIGTRSPTV 122 



236 KLDIAQLEENL KEKNDEIIiSLKQSLEENIVILSKQVE 272 

123 PHILKMLLKGYEAP01ALRCGIMLRECIRH - -EPLAKI ILFSNQFRD 167 

: : : I : : : | | : I : : ' I : 

273 DLNVKCQLLEKEKEDHVNRNREHNENLNAEMQNLKQKFILEQQERE 318 

DAFATFKDLLTRHKVLVADFLEQNYDT I FEDYEKLL 217 



168 FFKYVELSTFDI AS - ^ . [ 

319 KLQQKELlDSLLQQEKELSSSLHQKLCSFQEEMVKEKNLFEEEiKQTLDELDKLQQKEE 378 

218 QSENYV TKRQS LKLLGEL I LDRHNFA IMTKY 248 

379 UERLiKQLEEEAKSRAEEiiiiEF^KLKGKEAELEKSSAAHTQATLLLQEKYDSMVQSLE 438 
XSKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILL 2 95 



24 9 



4 3 9 cvTAQFEGYKALTASEliDiilENSsiQElAAKAGKNiEDiQHQILATESSNQEYVRMLL 4 9 8 



-, 96 jcfiQpK LIEFLSSFQKERTD-DEQFADEKNYIjIKQIRD 331 

II • : I | I := II I :: M=l 

*99 DLQTKSALKETEIKEITVSFLQKITDLQNQLKQQEEDFRKQLED 542 



Search completed: January 7, 2004, 16:52:26



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 7, 2004, 16:44:17 ; Search time 41 Seconds
(without alignments)
2121.067 Million cell updates/sec

Title: US-10-088-872-2
Sequence: 1 MKKMPLFSKSHKNPAEIVKI..FADEKNYLIKQIRDLKKTAP 337

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 830525 seqs, 258052604 residues

Total number of hits satisfying chosen parameters: 830525

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries 
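
The throughput figure in the run header can be checked from the other header values. The sketch below assumes that a "cell update" is one cell of the query-by-database dynamic-programming matrix, so the total is query length times database residues; the report does not define the term, and the function name is illustrative.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Throughput in millions of matrix cells per second (assumed definition)."""
    cells = query_len * db_residues
    return cells / seconds / 1e6

# Header values: 337-residue query, 258052604 residues searched, 41 seconds.
rate = million_cell_updates_per_sec(337, 258052604, 41)
print(f"{rate:.3f} Million cell updates/sec")
```

With these inputs the result agrees with the printed 2121.067 to three decimal places, which also confirms the 41-second search time.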



Database : SPTREMBL_23:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*
15: sp_rvirus:*
16: sp_bacteriap:*
17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

SUMMARIES
                  %
Result          Query
   No.   Score  Match Length DB  ID       Description
     1    1684   98.8    337 11  Q8BG52   Q8bg52 mus musculu
     2    1669   97.9    334 11  Q91WB8   Q91wb8 mus musculu
     3    1663   97.6    334 11  Q91YL0   Q91yl0 mus musculu
     4    1462   85.8    289  4  Q96FG1   Q96fg1 homo sapien
     5    1381   81.0    341 11  Q8VDZ8   Q8vdz8 mus musculu
     6  1066.5   62.6    636  5  Q21643   Q21643 caenorhabdi
     7     875   51.3    205 11  Q8K312   Q8k312 mus musculu
     8   709.5   41.6    333 10  Q8H5L9   Q8h5l9 oryza sativ
     9   671.5   39.4    345 10  Q8L9L9   Q8l9l9 arabidopsis
    10     590   34.6    322 10  Q8LIF3   Q8lif3 oryza sativ
    11     435   25.5    103 11  Q8K038   Q8k038 mus musculu
    12   134.5    7.9    677 16  O25188   O25188 helicobacte
    13     128    7.5    430 16  O26049   O26049 helicobacte
    14   123.5    7.2   1285 16  Q9WXU3   Q9wxu3 thermotoga
    15     120    7.0   1175 17  Q58914   Q58914 methanococc
    16   119.5    7.0   1056 16  Q8REF7   Q8ref7 fusobacteri
    17     119    7.0   1111  5  Q9VGE4   Q9vge4 drosophila
    18   118.5    7.0    554  5  Q8IN90   Q8in90 drosophila
    19   118.5    7.0    670  5  Q9VEC7   Q9vec7 drosophila
    20   118.5    7.0    670  5  Q9NFM7   Q9nfm7 drosophila
    21     117    6.9    808  5  Q8T133   Q8t133 dictyosteli
    22     117    6.9    808  5  Q9GSH4   Q9gsh4 dictyosteli
    23   116.5    6.8   1135  5  Q9NJQ4   Q9njq4 paramecium
    24     116    6.8    911 16  Q8EUI7   Q8eui7 mycoplasma
    25     116    6.8   1389  5  Q8I293   Q8i293 plasmodium
    26   115.5    6.8   1111  5  Q9U0K5   Q9u0k5 plasmodium
    27   115.5    6.8   1946  5  O97291   O97291 plasmodium
    28     115    6.7    473 11  Q8R436   Q8r436 mus musculu
    29     115    6.7   2518  5  Q8IEH2   Q8ieh2 plasmodium
    30   114.5    6.7   1941     Q8IAK6   Q8iak6 plasmodium
    31     114    6.7    743 13  Q9YGE7   Q9yge7 oncorhynchu
    32   113.5    6.7    833  4  Q9UF54   Q9uf54 homo sapien
    33   113.5    6.7    951  5  Q9VEC6   Q9vec6 drosophila
    34   113.5    6.7    984  5  Q8IN89   Q8in89 drosophila
    35     113    6.6    474  5  O97233   O97233 plasmodium
    36     113    6.6    647 11  Q8CA10   Q8ca10 mus musculu
    37   111.5    6.5   1925  5  Q8I2D1   Q8i2d1 plasmodium
    38   111.5    6.5   2429  5  Q9VFB1   Q9vfb1 drosophila
    39   111.5    6.5   2771  5  Q26216   Q26216 plasmodium
    40     111    6.5   2166 16  O51465   O51465 borrelia bu
    41     111    6.5   2819 16  Q98QP8   Q98qp8 mycoplasma
    42     110    6.5    461  5  O77390   O77390 plasmodium
    43     110    6.5   1183  2  O86064   O86064 helicobacte
    44     110    6.5   1758  5  Q8ILK5   Q8ilk5 plasmodium
    45   109.5    6.4    457 16  Q9PQM0   Q9pqm0 ureaplasma



ALIGNMENTS 



RESULT 1
Q8BG52
ID   Q8BG52     PRELIMINARY;      PRT;   337 AA.
AC   Q8BG52;
DT   01-MAR-2003 (TrEMBLrel. 23, Created)
DT   01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT   01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE   MO25-like protein homolog.
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX   NCBI_TaxID=10090;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=C57BL/6J; TISSUE=Eye, Pituitary, and Testis;
RX   MEDLINE=22354683; PubMed=12466851;
RA   The FANTOM Consortium,
RA   the RIKEN Genome Exploration Research Group Phase I & II Team;
RT   "Analysis of the mouse transcriptome based on functional annotation of
RT   60,770 full-length cDNAs.";
RL   Nature 420:563-573(2002).
DR   EMBL; AK030474; BAC26978.1; -.
DR   EMBL; AK053642; BAC35457.1; -.
DR   EMBL; AK076758; BAC36470.1; -.
SQ   SEQUENCE   337 AA;  39105 MW;  C62B5B58095A98C8 CRC64;

Query Match 98.8%; Score 1684; DB 11; Length 337;
Best Local Similarity 98.5%; Pred. No. 1.1e-110;
Matches 332; Conservative 2; Mismatches 3; Indels 0; Gaps 0;



Qy 

Qy 
Db 



Qy 
Db 

Db 
Qy 

Db 



' MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNE'KE 60 

~ 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 M 1 1 1 1 1 1 M I M 1 1 1 i 1 1 1 i i M 1 1 1 1 1 1 1 M 1 1 M h M 

!. MKKMPLFS KSHKNPAE I VKI LKDNLAI LEKQDKKTDKASEE VS KS LQAMKE I LCGTNDKE .0 
PPTEAVAOLAQELYS SGLL VTL I ADLQL IDFEGKKD VTQ I FNNI LRRQ IGTRS PTVEYI S 120 

1 1 1 1 1 1 i 1 ; 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MIIII.I 

o 1 P PTEAVAQLAQELYSSGLLVTLI ADLQL IDFEGKKD V ^ 0; 

, : ,n AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180 

^ , 1 1 1 1 ! I M 1 II I M 1 1 1 1 M M 1 1 1 i i M I ! I ! M II M ! ! 1 1 1 1 1 1 1 1 1 U It . 

•121 sHPHILFMTjLKGYEAPQIALRCGIMLRECIRHEPLA . . 

181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240 

Mill Mill IIIMIMIIIIIIMMIIMIIIM I Mill III Mill Mill Mil 

181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRil 240 
1 NF£ IMTKY I S KPENLKLMMNLLRDKS PNI QFEAFHVF KVF VAS PHKTQPI VE I LLKNQ PK 300: 

\\ \ Ml 1 1 1 1 Ml M II 1 1 II I M 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 M I II I II Ml II 1 1 M 

241 NFT IMTKYI S KPENLKLMMNLLRDKS PNIQFE AFHVFKVFVAS PHKTQP I VE ILLKNQPK 300 

rOl LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 3 37 

I M II I M M I I I I I I I M II I I I II I I I I I I I I M 
3 01 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP 337 



RESULT 2
Q91WB8
ID   Q91WB8     PRELIMINARY;      PRT;   334 AA.
AC   Q91WB8;
DT   01-DEC-2001 (TrEMBLrel. 19, Created)
DT   01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT   01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE   Similar to hypothetical protein FLJ12577 (MO25-like protein
DE   homolog).
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX   NCBI_TaxID=10090;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   TISSUE=Salivary gland;
RA   Strausberg R.;
RL   Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases.
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=C57BL/6J; TISSUE=Testis;
RX   MEDLINE=22354683; PubMed=12466851;
RA   The FANTOM Consortium,
RA   the RIKEN Genome Exploration Research Group Phase I & II Team;
RT   "Analysis of the mouse transcriptome based on functional annotation of
RT   60,770 full-length cDNAs.";
RL   Nature 420:563-573(2002).
DR   EMBL; BC016128; AAH16128.1; -.
DR   EMBL; AK076867; BAC36513.1; -.
DR   InterPro; IPR004392; Mo25.
DR   Pfam; PF03204; Mo25; 1.
KW   Hypothetical protein.
SQ   SEQUENCE   334 AA;  38718 MW;  822F04A87FB4EB6F CRC64;

Query Match 97.9%; Score 1669; DB 11; Length 334;
Best Local Similarity 98.5%; Pred. No. 1.3e-109;
Matches 329; Conservative 2; Mismatches 3; Indels 0; Gaps 0;

Qy    4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db    1 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNDKEPPT 60

Qy   64 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 123
        ||||||||||||||||||||||||||||||||||||||||||||||||| |||||||:||
Db   61 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRCPTVEYISSHP 120

Qy  124 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 183
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 180

Qy  184 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFA 243
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFT 240

Qy  244 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 303
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 300

Qy  304 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
        ||||||||||||||||||||||||||||||| ||
Db  301 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP 334


RESULT 3 
Q91YL0 

ID Q91YL0 PRELIMINARY; PRT; 334 AA.
AC Q91YL0;
DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update)
DE Similar to hypothetical protein FLJ12577.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Strausberg R.;
RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC015546; AAH16546.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 334 AA; 38761 MW; 5F9765360653750E CRC64;

Query Match 97.6%; Score 1663; DB 11; Length 334;
Best Local Similarity 98.2%; Pred. No. 3.3e-109;
Matches 328; Conservative 2; Mismatches 4; Indels 0; Gaps 0;

,i M PL FS KSHKNPAE I VK I LKDN LAI LE KQD'KKTDKAS EEVS KS LQAMKE I LCGTN EKEPPT 63 

ill|!|i||||!IIIIIMII!IMIIIIIIIII!Mlillll!MiiMII!IHIMI 

I MPLFSKSHraPAEIVKILKD 60 
•I £ AVAQL AQE LYS SGLLVTL I ADLQL IDFEGKKDVTQ I FNN I LRRQT GTRS PTVEY I S AHP 12 3: 

" i i i Mill 1 1 1 1 1 1 1 M 1 1 II i I M 1 1 1 M I M ! i 1 1 ! i I II Ml 1 1 1 1 1 1 1 1 MM I 

:.il EAVAQLAQ^L YS £GLL VTLI ADLQLIDFEGKKDVTQ I FKFNI LRRQ IGTRCPTV;EYI S SHP 120, 
i ->,' H' r LFMLLKGYEAJPQ I ALRCG IMLRECIRHE PLAKI I LFSNQFRDFFKYVELSTFP I ASDA 18 3 - 

: TtVT^TTTTi i i ii i 'm i i i i i i n n i i i i i i i i i i i 1 1 1 n 1 1 1 n 1 1 1 1 1 1 i-i i- ■ 

;, T21 H I LFM TjL KGYEAPQ I ALRCG I MLREC I RHEPLAKI I LF SNQFRD FFKYVEL3 TFD IAS DA 180 

*3* FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILD^NFA 243. 

MIIIIMIIIIIIIIMIIIIIIMIIIIMIIIMIIIIIIIIIM I II I II II II 

^1 FAT^roLLTRH^LV 240 



Qy 

Qy 
Qy 

Db 



2*4 I MTKY I S KPENL KLMMNLLRDKS PNI QFE AFHVFKVF VAS PHKTQP I VE I LL KNQPKL I E 303 ■ 

IIMIIIMIMMMMIIIIMIIMIIIIIIIIIIIMIIIIIIIIIIIIIIIIIM 

M 1 I MTKY I S KPENLKLMMNLIiRDKSPNIQFEAFHVFKVFVAS PHKTQP I VE I LLKNQPKL IE 3 00. 
■■:- 0 ,y FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337 

' ■ i 1 1 i 1 1 i 1 1 1 1 1 ! 1 1 1 II 1 1 1 1 i 1 1 1 M 1 1 1 M 

201 FLSS FQKERTDDEQFADEKNYLI KQ I RDLKKAAP 3 34 



RESULT 4
Q96FG1
ID Q96FG1 PRELIMINARY; PRT; 289 AA.
AC Q96FG1;
DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Hypothetical protein.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Placenta;
RA Strausberg R.;
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC010993; AAH10993.1; -.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 289 AA; 33738 MW; F57B9EFCF6ABF2D7 CRC64;

Query Match 85.8%; Score 1462; DB 4; Length 289;
Best Local Similarity 99.7%; Pred. No. 3.8e-95;
Matches 288; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy 


4 9- 


Db 


1 


Qy 


109 




61 


Qy 


169 


Db 


121 


Qy 




Db 




Qy 


2 89 


Db 


241 



Qy   49 MKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQ 108
        |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db    1 MKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEEKKDVTQIFNNILRRQ 60

Qy  109 IGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDF 168
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 IGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDF 120

Qy  169 FKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQS 228
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 FKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQS 180



JJXN.J-JJ-IVJJ2JJJ-L -LiX^iV-AliU X. x J- xv J. J. — 

M I H 1 1 i 1 1 1 1 1 M ! M II i i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 M 1 1 1 1 1 1 1 ! 



I IMIIIII MMI II MM IIIIMIIII Ml II! IIIIMII II 



RESULT 5
Q8VDZ8
ID Q8VDZ8 PRELIMINARY; PRT; 341 AA.
AC Q8VDZ8;
DT 01-MAR-2002 (TrEMBLrel. 20, Created)
DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE MO25 protein.
GN CAB39.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Strausberg R.;
RL Submitted (DEC-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC020041; AAH20041.1;
DR MGD; MGI:107438; Cab39.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 341 AA; 39843 MW; E7FECA529D6FE811 CRC64;

Query Match 81.0%; Score 1381; DB 11; Length 341;
Best Local Similarity 81.0%; Pred. No. 2.3e-89;
Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2;

Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59

|| | Mihlhlll lh::|:|IM III Mhllllhl MINI INN 
Db 1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119

II IMMMMIIhlMI IhlMIMMMIIM IMIIIIIIMIIhllllll 

Db 61 EPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179

MMIMIIIhhill Mill M Mill II MMhl !l 1 1 hi I hi MM 

Db 121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239

M 1 1 II 1 1 1 II II 1 1 h I hlllhM I M II M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ( I J = 1 1 1 

Db 181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299

! I h! 1 1 1 M II M I II M M II 1 1 1 ! II i 1 1 1 I! I M 1 1 1 h h M M h M I II 1 1 

lJb y t , y i HHFTIMTKYISKPENIjI<IjMMNLLRDKSRNIQFEAFHVFKVFVAIJPNKTQP1:L 3 0.0:'. 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336

! 1 1 1 1 1 1 M : I h 1 1 Ml 1 1 1 1 - 1 M 1 1 1 h I 

Db 301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRAA 337



RESULT 6 

Q21643
ID Q21643 PRELIMINARY; PRT; 636 AA.
AC Q21643;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-OCT-2001 (TrEMBLrel. 18, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical 72.3 kDa protein.
GN R02E12.2.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RX MEDLINE=99069613; PubMed=9851916;
RA None;
RT "Genome sequence of the nematode C. elegans: a platform for
RT investigating biology. The C. elegans Sequencing Consortium.";
RL Science 282:2012-2018(1998).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RA Leimbach D.;
RT "The sequence of C. elegans cosmid R02E12.";
RL Submitted (APR-1996) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RA Waterston R.;
RT "Direct Submission.";
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; U53337; AAA96186.2; -.
DR WormPep; R02E12.2; CE28410.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 636 AA; 72282 MW; 85D5853E9F0E3193 CRC64;

Query Match 62.6%; Score 1066.5; DB 5; Length 636;
Best Local Similarity 60.4%; Pred. No. 6.3e-67;
Matches 212; Conservative 53; Mismatches 69; Indels 17; Gaps 3;

Qy 2 KKMP-LFSKSHKNPAEIVKILKDNLAILEK-------------QDKKTDKASEEVSKSLQ 47

I || || ||||:|h:|l l= = I 11=1 III til = MM = =. 

Db 25 8 KVMPLLFGKSHKSPADWKTLREVLTILDKJjPPPKLDKDGNIQSDKKYDKALDEVSKNVA 317 

4 3 AMKE I LCGTN E KE P PTE AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVT^IFNNI 104 

: 1 : | I I * I I I I I \ I I = 1 - = I I I . I : II I I I I 11111 = 

;;:-> .'IS MiKSFlYGNDSAEPSSEHWQVAQLAQEVYNANILPMLJKMLPKFEFECKKDVGQIFNNL 377 

Qy 105 LRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQ 164

MMIMMMIIh ! i II M = M I III ihllil 111 = lllllhh 

Db 378 LRRQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDV 437
Qy 165 FRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVT 224

I li ih 1 1 1 = I f 1 1 = 1 1 1 = I MM = = 1 = 11= MM I 1= M 1 = 1111 

Db 438 FYTFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVT 497

Qy 225 KRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASP 284

: I i i | | | ! I I : I I I I I ! MMM =11=111 MMM I I I = I I I I 1 I I I I I h I 
Db 498 RRQSLKLLGELLLDRHMFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANP 557

Qy 285 HKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335

: | Ml =M =1= 11 = 1111 I =1 I MM I III 111111 = = = 1 : 
Db 558 NKPKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 608



RESULT 7
Q8K312
ID Q8K312 PRELIMINARY; PRT; 205 AA.
AC Q8K312;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Similar to calcium binding protein, 39 kDa (Fragment).
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Strausberg R.;
RL Submitted (MAY-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC029053; AAH29053.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
FT NON_TER 1 1
SQ SEQUENCE 205 AA; 24582 MW; 015261A02F808169 CRC64;

Query Match 51.3%; Score 875; DB 11; Length 205;
Best Local Similarity 83.6%; Pred. No. 4.8e-54;
Matches 168; Conservative 17; Mismatches 16; Indels 0; Gaps 0;

Qy 136 PQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHK 195

Y ^ |: IIIMIMMMIilhl II llhllhllllllillllllllllllll 

Db 1 PEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDIASDAFATFKDLLTRHK 60

Qy 196 VLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENL 255

' :| hlllhll I HUH M II M I II Ilhllllil lllillllllll 

Db 61 LLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDRHNFTIMTKYISKPENL 120

Qy 256 KLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDD 315

' 1 1 1 1 1 1 M M ! Ill Mill I II II I hhl I Ill-Ill ill I II I II Ml = l nft 

Db 121 KLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQTKLIEFLSKFQNDRTED 180

Qy 316 EQFADEKNYLIKQIRDLKKTA 336

III III I hill I llh I 

Db 181 KQFNDEKTYLVKQIRDLKRAA 201



RESULT 8 

Q8H5L9
ID Q8H5L9 PRELIMINARY; PRT; 333 AA.
AC Q8H5L9;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Putative MO25 protein (CGI-66).
GN OJ1060_D03.13.
OS Oryza sativa (japonica cultivar-group).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
OC Ehrhartoideae; Oryzeae; Oryza.
OX NCBI_TaxID=39947;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Nipponbare;
RA Sasaki T., Matsumoto T., Yamamoto K.;
RT "Oryza sativa nipponbare (GA3) genomic DNA, chromosome 7, BAC
RT clone:OJ1060_D03.";
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AP003803; BAC22269.1; -.
SQ SEQUENCE 333 AA; 38452 MW; CB6FC45E098C2401 CRC64;

Query Match 41.6%; Score 709.5; DB 10; Length 333;
Best Local Similarity 44.0%; Pred. No. 3.7e-42;
Matches 147; Conservative 67; Mismatches 109; Indels 11; Gaps 5;

Qy 6 LFSKSHKNPAEIVKILKDNLAILEKQ------DKKTDKASEEVSKSLQAMKEILCGTNEK 59

W || : ||::|: :: I h I I :: 1=1 I 5 s = : I I I I I 

Db , 4 .. LFKSKPRTPADWRQTRELLIFLDLHSGSRGGDAKREEKMAELSKNIRELKSILYGNGES 63 

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119

|| Ml Mil : I II I : : | : | I I h I • I : I • - - Ih 

Db 64 EPVTEACVQLTQEFFRENTLRLLIICLPKLNLETRKDATQWANLQRQQVSSKIVASEYL 123 

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179

^ |: :| |: || I | I I I I I III i I : : I : I 'I : s I I I : = I I I I 

Db 12 4 EANKDLLDTLI- SYENMDIALHYGSMLRECIRHQSIA- YVLESDHMKKFFDYIQLPNFD I 181 

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYE-KLLQSENYVTKRQSLKLLGELILD 238

II ^ 1 1 1 1 1 INI =11! I : : : 1 1 I 114111-1 M-ll 

Db 182 ASDASATFKELLTRHKATVAEFLS KNYDWFFSEFNTRLLSSTNYITKRQAIKFLGDMLLD 241 

Qy 239 RHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQ 298

~"|| :| :|:| = I I : : I I I I I I I I I I I I I 1 1 I 1 = I 1 = 'I H Ih h 

Db 242 RSNS TVMMR YVS S KDNLM I LMNLLRD S S KN I Q I E AFHVFKLF AANKNKPTE VVN I L VTNR 301 



Qy 299 PKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDL 332

||: ! : h := I I I I =1 Hhl I 
y,;2 SKLLRFFAGFKIDK- -DEQFEADKEQVIKEISAL 333 



RESULT 9
Q8L9L9
ID Q8L9L9 PRELIMINARY; PRT; 345 AA.
AC Q8L9L9;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein.
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP SEQUENCE FROM N.A.
RA Haas B.J., Volfovsky N., Town C.D., Troukhan M., Alexandrov N.,
RA Feldmann K.A., Flavell R.B., White O., Salzberg S.L.;
RT "Full-length messenger RNA sequences greatly improve genome
RT annotation.";
RL Genome Biol. 0:0-0(2002).
RN [2]
RP SEQUENCE FROM N.A.
RA Brover V., Troukhan M., Alexandrov N., Lu Y.-P., Flavell R.,
RA Feldmann K.;
RT "Full-Length cDNA from Arabidopsis thaliana.";
RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY038359; AAM65898.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 345 AA; 39841 MW;

Query Match 39.4%; Score 671.5; DB 10; Length 345;
Best Local Similarity 42.9%; Pred. No. 1.8e-39;
Matches 140; Conservative 68; Mismatches 113; Indels 5; Gaps

12 JCNPAE I VKI LKDNL AI LEKQD KKTDKAS EE VSKS LQAMKE I LCGTNEKE PPTE AVA 67 

12 [ T PQEviLiRisimiDTKTWEvlALEKALEEVEKN F SSLRGILSGDGETEPNADQAV 71 
68 QLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHILF 127 
7 2 UALLcKEDWSLviHKLiLGWETR^LLHCWSILLKQKVGDTYCCVQYFEEHFELLD 1 3 1 
: 2,5 MLLKGYEAPQ1ALRCGIMLRECIRHEPLAKIILFSNQ 

132 sLWCYDNKeIIlHCGSMLRECIKFPSLAKYILESACFELFFKFVELPNFDVASDAFSTF 191 

IBS roLLTRHK.LVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR™ 

|||||.| • I • - I | : I I : | I = II I I I : I I I I I M '• : I : I 111 
192 UiiiK^SvVSEFLTSHYTEFFDVYERLLTSSNYVTRRQSLKLLSDFLLEPPNGHIMKK 251 

,, g vi3KPE N LKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLI<NQPKLIEFLSS 307 
"" I I I . . I i I ■ I I I I I i I i : I I : ! I I : h I I ■ II : I M : M 

,52 VVILEVRYLKviMTLLKDSSKNIQI^ 311 ' 

30 3 UQKER-TDDEQFADEKNYLIKQIRDL 332 

: ::|:|MII = IMM 
312 LSPGKGSEDDQFEEEKELTIEEIQKL 337 



Qy 

Db 

QY 
Db 
Qy 
Db 

Qy 

Db 



RESULT 10
Q8LIF3
ID Q8LIF3 PRELIMINARY; PRT; 322 AA.
AC Q8LIF3;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein (P0503D09.26 protein).
GN OJ1316_A04.9 OR P0503D09.26.
OS Oryza sativa (japonica cultivar-group).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
OC Ehrhartoideae; Oryzeae; Oryza.
OX NCBI_TaxID=39947;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Nipponbare;
RA Sasaki T., Matsumoto T., Yamamoto K.;
RT "Oryza sativa nipponbare (GA3) genomic DNA, chromosome 7, BAC
RT clone:OJ1316_A04.";
RL Submitted (OTL-2001) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Nipponbare;
RA Sasaki T., Matsumoto T., Katayose Y.;
RT "Oryza sativa nipponbare (GA3) genomic DNA, chromosome 7, PAC
RT clone:P0503D09.";
RL Submitted (JUN-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AP003822; BAC06992.1;
DR EMBL; AP005455; BAC16736.1;
DR Gramene; Q8LIF3; -.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 2.
KW Hypothetical protein.
SQ SEQUENCE 322 AA; 37091 MW; 99434DFA7C2DCD21 CRC64;

Query Match 34.6%; Score 590; DB 10; Length 322;
Best Local Similarity 38.5%; Pred. No. 9e-34;
Matches 129; Conservative 73; Mismatches 109; Indels 24; Gaps 4;

rv 4 MPLFSKSHKNPA E I VKI LKDNL A I LE KQD KKTD - KAS EE VS KS L Q AMKEI L C GTN 57, 

| | :: || j : | : : | : : | | | : | | . | | I I : = : : : I I 
Db 1 MSFFFRAASRPARPSPQELVRSIKESLLAL DTRTGAKALEDVEKNVSTLRQTLSGDG 57 

5«3 EKEPPTEA VAQLAQELYS SGLLVTL I ADLQL IDFEGKKDVTQ 1FNNI LRRQI GTRS PTVE 117 

Ml | | hi |= H : =: = =11 = 11= U I = = = h 

Db 5 8 EVE PNQEQVLQ I ALE I CKEDVLS LFVQNMPSLGWEGRKDL AHCWS I LLRQKVDE AYCC VQ 117 

^ v ,,3 YISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTF 17 7 

I | | =| h i: ::| Mi I 111.11.:: Ml II 1 = I I hi I Mi I 
r )h . : i S 7IENHFDLLDFLWCYKJSTLEVALNCGNMLRECIKYPTLAKYILESSSFELFFOYVELSNF 177 

" 8 DIASDAFATFKDLLTRHKVLVADFLE0NYDTIFEDYEKLLQSENYVTKRQSLKLL3ELII- 2 3 7 

' iilMI IMIIIhh MM I =h M i Ml IMMMMM ! Ml 

S DIASDALNTFKDLLTKHEAAV3EFLCSHYEQFFELYTRLLTSTNYVTRRQSVKFLSEFLL 2 37 



•> 3R DRHNFAlMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILIiKN 2 97 

. : | || MM Mh II I III hhM h-IM 

KAPN AQIMKRYIVEVSYLNIMIGLL KVF V.ANPNKPRD X I QVLVDN 28.2 

Ov QPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDL 332 

:|:: I : = MM M = = MIM I 

Ob 2 83 HRELLKLLGNLPTSKGEDEQLEEERDLIIKEIEKL 317 

RESULT 11 
Q8K038
ID Q8K038 PRELIMINARY; PRT; 103 AA.
AC Q8K038;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Similar to RIKEN cDNA 1500031K13 gene.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Kidney;
RA Strausberg R.;
RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC034159; AAH34159.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 103 AA; 11291 MW; EA86A9F6E9E426E0 CRC64;

Query Match 25.5%; Score 435; DB 11; Length 103;
Best Local Similarity 97.8%; Pred. No. 1.8e-23;
Matches 89; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy    4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db    1 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNDKEPPT 60

Qy   64 EAVAQLAQELYSSGLLVTLIADLQLIDFEGK 94
        ||||||||||||||||||||||||||||| |
Db   61 EAVAQLAQELYSSGLLVTLIADLQLIDFEVK 91



RESULT 12
O25188
ID O25188 PRELIMINARY; PRT; 677 AA.
AC O25188;
DT 01-JAN-1998 (TrEMBLrel. 05, Created)
DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE DNA topoisomerase I (TOPA).
GN HP0440.
OS Helicobacter pylori (Campylobacter pylori).
OC Bacteria; Proteobacteria; Epsilonproteobacteria; Campylobacterales;
OC Helicobacteraceae; Helicobacter.
OX NCBI_TaxID=210;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=26695 / ATCC 700392;
RX MEDLINE=97394467; PubMed=9252185;
RA Tomb J.-F., White O., Kerlavage A.R., Clayton R.A., Sutton G.G.,
RA Fleischmann R.D., Ketchum K.A., Klenk H.-P., Gill S., Dougherty B.A.,
RA Nelson K., Quackenbush J., Zhou L., Kirkness E.F., Peterson S.,
RA Loftus B., Richardson D., Dodson R., Khalak H.G., Glodek A.,
RA McKenney K., FitzGerald L.M., Lee N., Adams M.D., Hickey E.K.,
RA Berg D.E., Gocayne J.D., Utterback T.R., Peterson J.D., Kelley J.M.,
RA Cotton M.D., Weidman J.M., Fujii C., Bowman C., Watthey L., Wallin E.,
RA Hayes W.S., Borodovsky M., Karp P.D., Smith H.O., Fraser C.M.,
RA Venter J.C.;
RT "The complete genome sequence of the gastric pathogen Helicobacter
RT pylori.";
RL Nature 388:539-547(1997).
DR EMBL; AE000559; AAD07502.1; -.
DR TIGR; HP0440;
DR InterPro; IPR003601; DNAtopI_ATP_bind.
DR InterPro; IPR003602; DNAtopI_DNA_bind.
DR InterPro; IPR000380; DNA_tpisomrase.
DR InterPro; IPR006171; Toprim_dom.
DR InterPro; IPR006154; Toprim_sub.
DR Pfam; PF01131; Topoisom_bac; 1.
DR Pfam; PF01751; Toprim; 1.
DR PRINTS; PR00417; PRTPISMRASEI.
DR SMART; SM00437; TOP1Ac; 1.
DR SMART; SM00436; TOP1Bc; 1.
DR SMART; SM00493; TOPRIM; 1.
KW Hypothetical protein; Isomerase; Complete proteome.
SQ SEQUENCE 677 AA; 77677 MW; 4B285B81F1092BB4 CRC64;

Query Match 7.9%; Score 134.5; DB 16; Length 677;
Best Local Similarity 21.6%; Pred. No. 0.24;
Matches 88; Conservative 58; Mismatches 134; Indels 127; Gaps 16;

7 FSKSHKNPA-EI VKILKDNL AILEKQDKK TDKASEEVSKSLQAMKE 51 

Y | || | : :| MM I = M M ^ Mlh 

Db 222 FKFKDKNEASQFLKDLKDGLGSMSVLVSLKESLSNKEPKKPFTTSKLLSQASKSLKI-" 278 

52 ILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGT 111 

11= =111-1 hh =lh | = = I h II 

Bh 279 PTKEIAQLAQKLFEAGLITYHRTDSEFLSPEYLKEHEVFFEPIY 322 

112 RSPTV- - EYIS AHPHILFMLLKGYEAPQIALRCGIMLREC1RHE 153 

|:| || : III I I I =1= -= I : I 

3.23 - -psVYQYREYKAGKNSQAEAHEAIRITHPHALKDLEKVCSDAKISEELALKLYQLIYTN J 8.0 

0 „ r: p L ^ _ -AKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIF 2 10- 

^" : :: h ||=- II I - I- = I I I 1=11 

t. /0 - TICSQSRNALY-NQYDCIFK IKSESFKLSFKLLKEKGFLEIEELIQGKEEIN 4 3 I- 

Qv . ,^ pdyeKLLQSENYVTKROSLKLLGELILDRHNFAIMTK^TSKPENLKLMMNLLRDKSPNIQ 2 7 0; 

; |: : ||: | | |= = : = I II = I I : 

r . b RE-EQESEIENFSLKEMDSVPLKEVFIK.K- - - - - IEKPSPKPYKESAFIPLLESEG- 4 81, 

27" FRAFHVFKVFVAS PHKTQP I VE ILLKNQ PKLIEFLSSFQKERTDD- 315 

" " : | : = :|M : = =1 H hi- I 

;r . 432 „.._ : 1 GRP S TY AS FLDLLL KRKY I S I DTKTN A I TPTS QGLE V 1 3 F F KKD KE VD F b3;l 

oi - -EQF-- -ADEKNYLIKQIRDLKKTA 3 36 

^ " :|.| | | M I I 

Db , 532 I ALTS KDKS KLGNTTKQFEECLDL I MRGEAS YEKFMLEVI S KLKSTA 5 78 

RESULT 13
O26049
ID O26049 PRELIMINARY; PRT; 430 AA.
AC O26049;
DT 01-JAN-1998 (TrEMBLrel. 05, Created)
DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update)
DT 01-MAR-2002 (TrEMBLrel. 20, Last annotation update)
DE Hypothetical protein HP1520.
GN HP1520.
OS Helicobacter pylori (Campylobacter pylori).
OC Bacteria; Proteobacteria; Epsilonproteobacteria; Campylobacterales;
OC Helicobacteraceae; Helicobacter.
OX NCBI_TaxID=210;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=26695 / ATCC 700392;
RX MEDLINE=97394467; PubMed=9252185;
RA Tomb J.-F., White O., Kerlavage A.R., Clayton R.A., Sutton G.G.,
RA Fleischmann R.D., Ketchum K.A., Klenk H.-P., Gill S., Dougherty B.A.,
RA Nelson K., Quackenbush J., Zhou L., Kirkness E.F., Peterson S.,
RA Loftus B., Richardson D., Dodson R., Khalak H.G., Glodek A.,
RA McKenney K., FitzGerald L.M., Lee N., Adams M.D., Hickey E.K.,
RA Berg D.E., Gocayne J.D., Utterback T.R., Peterson J.D., Kelley J.M.,
RA Cotton M.D., Weidman J.M., Fujii C., Bowman C., Watthey L., Wallin E.,
RA Hayes W.S., Borodovsky M., Karp P.D., Smith H.O., Fraser C.M.,
RA Venter J.C.;
RT "The complete genome sequence of the gastric pathogen Helicobacter
RT pylori.";
RL Nature 388:539-547(1997).
DR EMBL; AE000650; AAD08565.1; -.
DR TIGR; HP1520; -.
KW Hypothetical protein; Complete proteome.
SQ SEQUENCE 430 AA; 50573 MW; 23DC6FE5E956B629 CRC64;

Query Match 7.5%; Score 128; DB 16; Length 430;
Best Local Similarity 20.9%; Pred. No. 0.39;
Matches 82; Conservative 73; Mismatches 135; Indels 102; Gaps 20;

q,/ 7 FSKSHKNPAEI VKTLKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKfiPP 62 

| : |: II lllhhh 1= = =lh : | - | = 

Db 60 F YPNRKS KI E I EFNGEKI LKENVAVFHS YDE - - EFS S EDS VTTFMAKSDL --KQQY 111 

0v 53 TSAVAQLAQELYSSGLLVTL--TA DLQ LID FEG KKD VTQ I FNN 1 LR 106 

: :| :| \\ :| I I = = : I I I =1 A I 

:Ay i 12 - N ILLELEKE--KKALLKSLRDIASGFDYEEEIKTIKNEKNKSFYElLDNHLT i EIESSEK 169 

- 07 -RQIGTRSPT7-I3YISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKII 159 

| i | | r::: | - I ■ . I I- 
~w 1/0 HYS FKYRD I FDGS KICVKDFVNKHHDL I EQYFNKYQ ELLSQSK 2 l.i 

-;; 0 i LF SNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQ- - - - - - 204, 

^ | :| I I : I • I ! : : = - 1 ==.!"-! I 

m 212 IFKHMNSGDFGTNHADDLKKALENNRFFKANHSLKIAGKEITNYQKL-SDIFENEKNRIL 270 

n v o OS NYDTIFEDYEKLLQSENYVTKRQSLKLLGELI - LDRHNF- -AIMTKYISKP 2 5.2- 

| - : | : : h I : : I I 'I | | : | : : | : : 

rb 271 NNEELKESFDKI EKVINANKELKAFKDAISKDNTLLTEFLDYDSFRKKVLFSYLKQV 3 27 

Qv -53 - ENLKLMMNLLRDKS PNI QFEAFHVFKVFVAS PHKTQP I VE I LLKNQPKLI EFLS S FQKE 311 

^ :|:| ::|| hi ! h : I * ' -II I! h I I ' 

m . IQNVKSLWLYREKKPEIE EIIKQASKDQKEWESVIEIF- -NQRFLVPFKVELQNQ 38.1 

. q,, ' "V2 R TDDEQ F ADE KNY L I KQ I RD L KK 334 

| | |:|: : I I hi 

Dh 382 KDILLNKDAAQFRFIFSDDNQDMNVQKEDLQK 413 



RESULT 14
Q9WXU3
ID Q9WXU3 PRELIMINARY; PRT; 1285 AA.
AC Q9WXU3;
DT 01-NOV-1999 (TrEMBLrel. 12, Created)
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE COME protein, putative.
GN TM0088.
OS Thermotoga maritima.
OC Bacteria; Thermotogae; Thermotogales; Thermotogaceae; Thermotoga.
OX NCBI_TaxID=2336;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=MSB8 / DSM 3109;
RX MEDLINE=99237316; PubMed=10360571;
RA Nelson K.E., Clayton R.A., Gill S.R., Gwinn M.L., Dodson R.J.,
RA Haft D.H., Hickey E.K., Peterson J.D., Nelson W.C., Ketchum K.A.,
RA McDonald L., Utterback T.R., Malek J.A., Linher K.D., Garrett M.M.,
RA Stewart A.M., Cotton M.D., Pratt M.S., Phillips C.A., Richardson D.,
RA Heidelberg J., Sutton G.G., Fleischmann R.D., Eisen J.A., White O.,
RA Salzberg S.L., Smith H.O., Venter J.C., Fraser C.M.;
RT "Evidence for lateral gene transfer between Archaea and Bacteria from
RT genome sequence of Thermotoga maritima.";
RL Nature 399:323-329(1999).
DR EMBL; AE001695; AAD35182.1; -.
DR TIGR; TM0088; -.
DR InterPro; IPR004846; GSPII/IIIprotein.
DR InterPro; IPR001993; Mitoch_carrier.
DR Pfam; PF00263; GSPII_III; 1.
DR PROSITE; PS00215; MITOCH_CARRIER; 1.
KW Complete proteome.
SQ SEQUENCE 1285 AA; 145209 MW; 057435F821FB0EA5 CRC64;

Query Match 7.2%; Score 123.5; DB 16; Length 1285;
Best Local Similarity 21.5%; Pred. No. 3;
Matches 86; Conservative 78; Mismatches 129; Indels 107; Gaps 23;

1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQD --KKT- - DKASEEV --SKS 45 

" .1 11:1 |: : | h =: Ml Ml I I- I 

555 IiKVAMLSGKEEEN VQKAAEELQIISSEERIIRFVKKTENVPIDKAKNVVDQLYSVS 611/ 

4 6 jjQAMKEILCGTNEICEPPTEAVAQLAQELYS SGL - '•" ""^ L . AD " " 85 " 

612 IEELGNELVVIGErLeVEKaLLqKIFSSEVEISRDFVKLPSWIDEQEKLLEVVKNSA 670. 

"6 -DQLID FEGKXD VTQIFNNILRRQIG- -TRSPTVEYI - - -SAHPHILFML 129 

... I Ml I : : : | : : | : : : i : Ml- Ml I : 

571 GITYEILDGWYFEGTKENVEKAKELFSDIVEK-LGEVRKEETVEFLEVNSSFPVDEFIN 729 

C;. y ,13.0 LKGYEAPQIALRCGIMLRECIRHEPLAKIIL- FSNQFRDFF KYVELST 

Db 73G LSGKLYPDVT- - CFSLDQLGLLVLKGSSEAVEDLSSMYRSFFERHQKIVKENV 780- 

; , 1>7 pp j ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLG 233' 

A1 "-II ■ : : I : : | : | | | : = : : I I I I 1 = = : : I I 

Db 7 81 FDRLMLEVPSGFSFEEFKTFLEVLVPEVKQ WYLDKLNLLLVEVPVSQSERVKSLL 836 

-ri ULILDRHNFAIMTKYIS KPENL - KLMMNLLRDKS PN I QFEAF - HVFKVFVAS 283 

I . | : | : | : II I I : : I : : : : I 

Db ,, yl DTF LKKEEAVSEKKAVKSVTIPSGWPDELSSYLKKLLR-,--NVEITVFPNMGQMIVEG 892 

nw , al p-HKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEK 322 

Q ' " •" | .: : || = : = \ • ■ III I ==11 

Db 393 PENEVEKAVELVEAEKEKIV LKERKDYVKVSDGK 926 



RESULT 15
Q58914
ID Q58914 PRELIMINARY; PRT; 1175 AA.
AC Q58914;
DT 01-JUN-1998 (TrEMBLrel. 06, Created)
DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein MJ1519.
GN MJ1519.
OS Methanococcus jannaschii.
OC Archaea; Euryarchaeota; Methanococci; Methanococcales;
OC Methanocaldococcaceae; Methanocaldococcus.
OX NCBI_TaxID=2190;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=JAL-1 / DSM 2661 / ATCC 43067;
RX MEDLINE=96337999; PubMed=8688087;
RA Bult C.J., White O., Olsen G.J., Zhou L., Fleischmann R.D.,
RA Sutton G.G., Blake J.A., FitzGerald L.M., Clayton R.A., Gocayne J.D.,
RA Kerlavage A.R., Dougherty B.A., Tomb J.-F., Adams M.D., Reich C.I.,
RA Overbeek R., Kirkness E.F., Weinstock K.G., Merrick J.M., Glodek A.,
RA Scott J.L., Geoghagen N.S.M., Weidman J.F., Fuhrmann J.L., Nguyen D.,
RA Utterback T.R., Kelley J.M., Peterson J.D., Sadow P.W., Hanna M.C.,
RA Cotton M.D., Roberts K.M., Hurst M.A., Kaine B.P., Borodovsky M.,
RA Klenk H.-P., Fraser C.M., Smith H.O., Woese C.R., Venter J.C.;
RT "Complete genome sequence of the methanogenic archaeon, Methanococcus
RT jannaschii.";
RL Science 273:1058-1073(1996).
DR EMBL; U67593; AAB99538.1; -.
DR TIGR; MJ1519; -.
DR InterPro; IPR003593; AAA_ATPase.
DR SMART; SM00382; AAA; 1.
KW Hypothetical protein; Complete proteome.
SQ SEQUENCE 1175 AA; 138618 MW; 99082EA5A4D11140 CRC64;

Query Match 7.0%; Score 120; DB 17; Length 1175;
Best Local Similarity 21.5%; Pred. No. 4.8;
Matches 76; Conservative 58; Mismatches 131; Indels 88; Gaps 10;

7 FSKSHKNPAEIVKILKD-NLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEA 65 

I . | ■ I I I I = I II 1 = =1 = I i : I I 

Eb 232 FNKFREENQDFDKYLTDENIAFRPHVMKKFDEFAENIKKVIAELE- - - -GSKYKYPGLPG 287 

, : , 7 6 6 VAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHI 125 

Db y LYFLGMEDAYSRYIELWKNEGEKGEEKLYNALI -ESLENRKENLEF- 333 

Qv 1.26 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFK YVELSTFDIA- 180 

„ ' GITKKVIDKFIAQKEEFREFLKNYAVYYELSAFKLEK 370 

Db 

Qy 1&1 ' SDAFATF KDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSL 22 9- 

Db 371 I KE QYE KE F I NLDN I I KN P Y I L VED - L KEN DSFERIIFEELDSWERRRLGDKFNP 424 



2 30 KLLGELILDRH NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAF 274 

M I II II I III : 1 1 : I I : i 

425 YSPYRVRALLVE- ILKRHLSSGNTTISTK DLKDFFEKMDKDI VKITFDEFLRI I 477 

^75 H VFKVF VAS PHKTQP I VE ILLKNQPKL I EFLS S FQKERTDDEQFADEKNYL I K 327 

,\ :: | : : : : |: | | I : - I H = llhl 
473 EEYKD I I S - - EKVE I VKKE VKNNENKE I IELFTLKE IRE YEE I I ENTINYLLK 528 



Search completed: January 7, 2004, 16:48:05 
Job time : 56 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 7, 2004, 16:44:17 ; Search time 17 Seconds
(without alignments)
932.235 Million cell updates/sec

Title: US-10-088-872-2
Sequence: 1 MKKMPLFSKSHKNPAEIVKI FADEKNYLIKQIRDLKKTAP 337

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 127863 seqs, 47026705 residues

Total number of hits satisfying chosen parameters: 127863

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SwissProt_41:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.



SUMMARIES 



                     %
Result             Query
   No.    Score    Match  Length  DB  ID           Description

     1     1685     98.9     334   1  MO2L_HUMAN   Q9h9s4 homo sapien
     2     1669     97.9     334   1  MO2L_MOUSE   Q9db16 mus musculu
     3     1381     81.0     341   1  MO25_HUMAN   Q9y376 homo sapien
     4     1376     80.8     341   1  MO25_MOUSE   Q06138 mus musculu
     5     1111     65.2     339   1  MO25_DROME   P91891 drosophila
     6   1006.5     59.1     338   1  MO2M_CAEEL   O18211 caenorhabdi
     7    IS3.5     49.0     329   1  YFV6_SCHPO   Q9p7q8 schizosacch
     8      776     45.5     321   1  DE76_CHLPR   Q9xfy6 chlorella p
     9      728     42.7           1  MO2N_ARATH   Q9fgk3 arabidopsis
    10    716.5     42.0     343   1  MO2M_ARATH   Q9m0m4 arabidopsis
    11      665     39.1     384   1  HYMA_EMENI   O60032 emericella
    12      632     37.1     348   1  MO2L_ARATH   Q9zq77 arabidopsis
    13              28.5     399   1  HYM1_YEAST   P32464 saccharomyc
    14    143.5      8.4     339   1  MO2L_CAEEL   Q9tzm2 caenorhabdi
    15    128.5      7.5    3911   1  AKA9_HUMAN   Q99996 h a-kinase
    16    125.5      7.4     298   1  Y295_RICPR   Q9zdn2 rickettsia
    17    118.5      7.0     959   1  DPO5_SCHPO   O60094 schizosacch
    18    116.5      6.8     724   1  HMMR_HUMAN   O75330 homo sapien
    19      115      6.7     474   1  GSHB_MOUSE   P51855 mus musculu
    20    112.5      6.6    1411   1  YM42_YEAST   Q03214 saccharomyc
    21    109.5      6.4     978   1  RA50_AQUAE   O67124 aquifex aeo
    22      109      6.4     695   1  YCX7_CHLVU   O20159 chlorella v
    23      109      6.4    1401   1  LATA_LATMA   P23631 latrodectus
    24    108.5      6.4     586   1  2A5D_RABIT   Q28653 o serine/th
    25    108.5      6.4     602   1  2A5D_HUMAN   Q14738 h serine/th
    26    108.5      6.4    1075   1  Y124_METJA   Q57588 methanococc
    27      108      6.3     568   1  DNAB_PORPU   P51333 porphyra pu
    28    107.5      6.3     483   1  ACPA_BACAN   Q44643 bacillus an
    29    107.5      6.3    1042   1  T1RH_METJA   Q60295 methanococc
    30    107.5      6.3    1726   1  MSP1_PLAFC   P04934 Plasmodium
    31    107.5      6.3    1726   1  MSP1_PLAFP   P50495 Plasmodium
    32      107      6.3    1727   1  ALM1_SCHPO   Q9utk5 schizosacch
    33      106      6.2     474   1  GSHB_HUMAN   P48637 homo sapien
    34    105.5      6.2     793   1  REGA_DICDI   Q23917 dictyosteli
    35    105.5      6.2     847   1  RSG2_RAT     Q63713 rattus norv
    36    104.5      6.1    1701   1  MSP1_PLAFF   P13819 Plasmodium
    37    104.5      6.1    1701   1  MSP1_PLAFM   P08569 Plasmodium
    38      104      6.1     859   1  MUTS_AQUAE   O66652 aquifex aeo
    39      104      6.1    1290   1  RA50_SCHPO   Q9utj8 schizosacch
    40      104      6.1    1682   1  MSP1_PLAF3   P19598 Plasmodium
    41    103.5      6.1     641   1  PRIM_UREPA   Q9ppz6 ureaplasma
    42      103      6.0    2663   1  CENE_HUMAN   Q02224 homo sapien
    43    102.5      6.0     502   1  URIC_BACSB   Q45697 bacillus sp
    44    102.5      6.0     975   1  KINH_DROME   P17210 drosophila
    45    102.5      6.0    1202   1  RPM2_YEAST   Q02773 saccharomyc




ALIGNMENTS 



RESULT 1
MO2L_HUMAN
ID MO2L_HUMAN STANDARD; PRT; 334 AA.
AC Q9H9S4; Q9BZ33;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE MO25-like protein.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE OF 4-334 FROM N.A.
RA Isogai T., Ota T., Hayashi K., Sugiyama T., Otsuki T., Suzuki Y.,
RA Nishikawa T., Nagai K., Sugano S., Shiratori A., Sudo H.,
RA Wagatsuma M., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M.,
RA Takahashi M., Chiba Y., Ishida S., Murakawa K., Ono Y., Takiguchi S.,
RA Watanabe S., Kimura K., Murakami K., Ishii S., Kawai Y., Saito K.,
RA Yamamoto J., Wakamatsu A., Nakamura Y., Nagahari K., Masuho Y.,
RA Ninomiya K., Iwayanagi T.;
RT "NEDO human cDNA sequencing project.";
RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE OF 276-334 FROM N.A.
RA Pearce A.;
RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases.
CC -!- SIMILARITY: Belongs to the Mo25 family.
CC -------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -------
DR EMBL; AK022639; BAB14147.1; ALT_INIT.
DR EMBL; AL13887LS; CAC28084.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 334 AA; 38728 MW; 97702273D8548432 CRC64;

Query Match 98.9%; Score 1685; DB 1; Length 334;
Best Local Similarity 99.7%; Pred. No. 1.3e-100;
Matches 333; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy          4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 60

Qy         64 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 123
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 120

Qy        124 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 183
              |||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
Db        121 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLVKIILFSNQFRDFFKYVELSTFDIASDA 180

Qy        184 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFA 243
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFA 240

Qy        244 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 303
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 300

Qy        304 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
              ||||||||||||||||||||||||||||||||||
Db        301 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 334



RESULT 2
MO2L_MOUSE
ID MO2L_MOUSE STANDARD; PRT; 334 AA.
AC Q9DB16; Q8BGS2; Q91WB8; Q91YL0;
DT 16-OCT-2001 (Rel. 40, Created)
DT 15-SEP-2003 (Rel. 42, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE MO25-like protein.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC STRAIN=C57BL/6J;
RC TISSUE=Cerebellum, Eye, Pituitary, and Testis;
RX MEDLINE=22354683; PubMed=12466851;
RA Okazaki Y., Furuno M., Kasukawa T., Adachi J., Bono H., Kondo S.,
RA Nikaido I., Osato N., Saito R., Suzuki H., Yamanaka I., Kiyosawa H.,
RA Yagi K., Tomaru Y., Hasegawa Y., Nogami A., Schonbach C., Gojobori T.,
RA Baldarelli R., Hill D.P., Bult C., Hume D.A., Quackenbush J.,
RA Schriml L.M., Kanapin A., Matsuda H., Batalov S., Beisel K.W.,
RA Blake J.A., Bradt D., Brusic V., Chothia C., Corbani L.E., Cousins S.,
RA Dalla E., Dragani T.A., Fletcher C.F., Forrest A., Frazer K.S.,
RA Gaasterland T., Gariboldi M., Gissi C., Godzik A., Gough J.,
RA Grimmond S., Gustincich S., Hirokawa N., Jackson I.J., Jarvis E.D.,
RA Kanai A., Kawaji H., Kawasawa Y., Kedzierski R.M., King B.L.,
RA Konagaya A., Kurochkin I.V., Lee Y., Lenhard B., Lyons P.A.,
RA Maglott D.R., Maltais L., Marchionni L., McKenzie L., Miki H.,
RA Nagashima T., Numata K., Okido T., Pavan W.J., Pertea G., Pesole G.,
RA Petrovsky N., Pillai R., Pontius J.U., Qi D., Ramachandran S.,
RA Ravasi T., Reed J.C., Reed D.J., Reid J., Ring B.Z., Ringwald M.,
RA Sandelin A., Schneider C., Semple C.A., Setou M., Shimada K.,
RA Sultana R., Takenaka Y., Taylor M.S., Teasdale R.D., Tomita M.,
RA Verardo R., Wagner L., Wahlestedt C., Wang Y., Watanabe Y., Wells C.,
RA Wilming L.G., Wynshaw-Boris A., Yanagisawa M., Yang I., Yang L.,
RA Yuan Z., Zavolan M., Zhu Y., Zimmer A., Carninci P., Hayatsu N.,
RA Hirozane-Kishikawa T., Konno H., Nakamura M., Sakazume N., Sato K.,
RA Shiraki T., Waki K., Kawai J., Aizawa K., Arakawa T., Fukuda S.,
RA Hara A., Hashizume W., Imotani K., Ishii Y., Itoh M., Kagawa I.,
RA Miyazaki A., Sakai K., Sasaki D., Shibata K., Shinagawa A.,
RA Yasunishi A., Yoshino M., Waterston R., Lander E.S., Rogers J.,
RA Birney E., Hayashizaki Y.;
RT "Analysis of the mouse transcriptome based on functional annotation of
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
RN [2]
RP SEQUENCE FROM N.A. (ISOFORM 1).
RC STRAIN=FVB/N; TISSUE=Mammary gland, and Salivary gland;
RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,



RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;
RT "Generation and initial analysis of more than 15,000 full-length human
RT and mouse cDNA sequences.";
RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).
CC -!- ALTERNATIVE PRODUCTS:
CC     Event=Alternative splicing; Named isoforms=2;
CC     Name=1;
CC       IsoId=Q9DB16-1; Sequence=Displayed;
CC     Name=2;
CC       IsoId=Q9DB16-2; Sequence=VSP_007417, VSP_007418;
CC       Note=No experimental confirmation available;
CC -!- SIMILARITY: Belongs to the Mo25 family.
CC -------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -------



EMBL;
EMBL;
EMBL;
EMBL;
EMBL;
EMBL;
EMBL;

AK005323;
AK030474;
AK053642;
AK076758;
AK076867;
BC016128;
BC016546;

BAB23953.
BAC26978.
BAC35457.
BAC36470.
BAC36513
AAH16128
AAH16546

ALT_INIT.
ALT_INIT.
ALT_INIT.
ALT_INIT.


DR MGD; MGI:1916258; 1500031K13Rik.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Alternative splicing.
FT VARSPLIC 276 293 VFVASPHKTQPIVEILLK -> NSVFITNRIHGLKRWLSS
FT                  (in isoform 2).
FT                  /FTId=VSP_007417.
FT VARSPLIC 294 334 Missing (in isoform 2).
FT                  /FTId=VSP_007418.
FT CONFLICT 42 42 S -> P (IN REF. 1; BAB23953).
FT CONFLICT 229 229 L -> R (IN REF. 2; AAH16546).
SQ SEQUENCE 334 AA; 38718 MW; 822F04A87FB4EB6F CRC64;

Query Match 97.9%; Score 1669; DB 1; Length 334;
Best Local Similarity 98.5%; Pred. No. 1.4e-99;
Matches 329; Conservative 2; Mismatches 3; Indels 0; Gaps 0;



Vjv- 



4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63 

M 1 1 1 1;: ll I :i 1 1 1 II 1 1 1 1 ' 1 1 1 1 1 II M 1 1 M 1 1 .III M 1 1 1 II 1 1 1 1 hit II I 

1 MPLFS KSHKNPAE I VKI LKDNLAI LEKQDKKTD KAS EE VS KS LQAMKE I L CGTNDKE P PT 6 0 



Qy 

Db 

Qy 

Db 



^4 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 

1 1 1 1 1 1 ! i 1 1 1 M 1 1 1 1! 1 1 1 1 1 1 1 1 i I M ! 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 MIIIMMI 

61 SAVAQLAQELYS SGLLVTLI ADLQLIDFEGKKDVTQI FNNI LRRQIGTRCPTVE YI S SHP 



123 . 



120. 



124. HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 183 



II 



1 1 1 1 1 1 1 ll 1 1 ii 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 



HUM! 



MINIM 



121 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 180 



Qy 


184 


F Z^TFKDLLTRHKVLVADFLEQNYDT I FED YEKLLQS ^^yT^® ^^^T^ 

VTi 1 1 1 1 ii 1 1 1 1 1 1 1 1 M I i 1 1 1 M 1 1 II 1 1 1 1 M N 1 1 1 1 1 


24: 


Db 


181 




24 


Qy 


244 


IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVAS^^ 

n Mi ill llllll Ml II III Ml MUM II III 1 I IIIIIUIMIMIIIN 


30 


Db 


241 


ImUyIsUe^ 


30 


Qy 


3 04 


FLS S FQKERTDDEQFADEKNYL I KQIRDLKKTAP 33 7 




Db 


301 


IIMMIMMIIIMIMMIIIMMIM M 

FLS S FQKERTDDEQFADEKNYL I KQ IRDLKKAAP 334 





RESULT 3
MO25_HUMAN
ID MO25_HUMAN STANDARD; PRT; 341 AA.
AC Q9Y376;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE MO25 protein (CGI-66).
GN MO25.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=20272150; PubMed=10810093;
RA Lai C.-H., Chou C.-Y., Ch'ang L.-Y., Liu C.-S., Lin W.-C.;
RT "Identification of novel human genes evolutionary conserved in
RT Caenorhabditis elegans by comparative proteomics.";
RL Genome Res. 10:703-713(2000).
RN [2]
RP SEQUENCE FROM N.A.
RC TISSUE=Hypothalamus;
RA Jin W., Sni Rerx S., Gu J., Fu S., Huang Q., Dong H., Yu Y.,
RA Wang Y., Chen Z., Han Z.;
RT "novel gene expressed in the human hypothalamus.";
RL Submitted (DEC-1998) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC TISSUE=Duodenum;
RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,
RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;
RT "Generation and initial analysis of more than 15,000 full-length
RT human and mouse cDNA sequences.";
RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).
CC -!- SIMILARITY: Belongs to the Mo25 family.
CC -------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -------
DR EMBL; AF151824; AAD34061.1; -.
DR EMBL; AF113536; AAF14873.1;
DR EMBL; BC020570; AAH20570.1; -.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 341 AA; 39869 MW; EC710A528B6F9811 CRC64;

Query Match 81.0%; Score 1381; DB 1; Length 341;
Best Local Similarity 81.0%; Pred. No. 3.1e-81;
Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2;

4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAI ; ^ILCGTNEK 5.9 

j I I Ml 1,1 hill ||:::|:|!ll III N I N' I I II = I I I I I I I UNI 
1 MPFPFGKSHKSPADI^ 60 

EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119- 

y/ ^ u Mil I III II hi III I HI I Nil III 1 1 II II IN INN III IN II IN 

61 EPQTEAVAQLAQELYNSGLLSTLvJ^ 120 

120 sahphilfmllkgyeapqialrcgimlrecirh™ 179 
•IIIIIHIIhlNN IN NNN IN 1 1 INNNNIN an 

121 CTQQnIlFMLlIgYESPExALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180. 
0V 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ^ 239 

1 1 1 M 1 1 1 1 1 1 1 1 i II : I h 1 1 II : 1 1 I NNN 1 1 1 N 1 1 N II I N 1 1 h I II 

j& 181 isDAFATF^LLTRHkLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240- 

Oy 240 l^FAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFip/FyASPHKTQPIVETLL^QP 299 

"• in l II II 1 1 1 1 II 1 1 1 1 II 1 1 1 1 N III IINHNNIhNIIIII 

Db 241 HNFTIMTKYISKPENLKLMMN^ 300 ' 

3 00 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 33 6 

Ml I III II N INN I III NNN II 11= i n 

Db 301 KLIEFLSKFQNERT5DEQFNDEKTYLVKQIRDLKRPA 3 37 



Db 



Kb 



RESULT 4
MO25_MOUSE
ID MO25_MOUSE STANDARD; PRT; 341 AA.
AC Q06138;
DT 01-FEB-1994 (Rel. 28, Created)
DT 01-FEB-1994 (Rel. 28, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE MO25 protein.
GN MO25 OR CAB39.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=93119656; PubMed=8418809;
RA Miyamoto H., Matsushiro A., Nozaki M.;
RT "Molecular cloning of a novel mRNA sequence expressed in cleavage
RT stage mouse embryos.";
RL Mol. Reprod. Dev. 34:1-7(1993).
CC -!- FUNCTION: ONE OF THE FIRST GENES TO BE TRANSCRIBED DURING MOUSE
CC     DEVELOPMENT, MAY PLAY SOME GENERAL FUNCTION.
CC -!- SUBCELLULAR LOCATION: Cytoplasmic (Potential).
CC -!- DEVELOPMENTAL STAGE: TRANSCRIBED DURING EARLY MOUSE DEVELOPMENT.
CC     DETECTED AT ALL DEVELOPMENTAL STAGES FROM THE EGG THROUGH THE
CC     BLASTOCYST, MOST ABUNDANT AT THE 2-CELL STAGE.
CC -!- SIMILARITY: Belongs to the Mo25 family.
CC -------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -------
DR EMBL; S51858; AAB24301.1; -.
DR MGD; MGI:107438; Cab39.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 341 AA; 39842 MW; E7F668529D6FE811 CRC64;

Query Match 80.8%; Score 1376; DB 1; Length 341;
Best Local Similarity 80.7%; Pred. No. 6.5e-81;
Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps 2;

MPL-FSKSHKNPAEIVKILKDNLAILEKQ- - - DKKTDKAS EEVS KS LQAMKEI LCGTNEK 59 

|| I I Ml - 1 i : 1 1 1 lh-|:MII Mi MhMMIM MMM Mill 

MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK bO 
EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

M IIIIIMIMMMMI I h I II II M I II M 1 1 I II I II II 1 1 M I h II 1 1 II 

EPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 12 0 
SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179 

M 1 1 II II 1 1 1 : h I M II II MM MMM MM hi II Mhllhlllll 



ASDAFATFKDLLTKHKVljV/UJr ijiiUWiJJ i j.r ^.JJiriru-iijyociM x v x i\x^v^- LJ ^ LJ - LJV - , ^ , - LJ J - J - , ^ iV 

IIMIMIMI IIMhl hiilh.ll I =11111 IMIIIIIIIIIIIIIhlll 

ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240 



Qy 


4 


Db 


1 


Qy 


50 


Db 


61 


Qy 


12 0 


Db 


121 


Qy 


180 


Db 


181 



n „ 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 

Qy in | Ml III Ml Mill III MM III MhhMMh: | 

Db 241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQT 

nv 300 KL I EFLS S FQKERTDDEQF ADE KNYL I KQ I RDLKKTA 336 

IIMIII II =II = »IM Ml Ihllthlh I 

Db 3 01 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA 33 7 



RESULT 5
MO25_DROME
ID MO25_DROME STANDARD; PRT; 339 AA.
AC P91891; Q9VV85;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE MO25 protein (dMo25).
GN MO25 OR CG4083.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;
OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96268479; PubMed=8672247;
RA Nozaki M., Onishi Y., Togashi S., Miyamoto H.;
RT "Molecular characterization of the Drosophila Mo25 gene, which is
RT conserved among Drosophila, mouse, and yeast.";
RL DNA Cell Biol. 15:505-509(1996).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=Berkeley;
RX MEDLINE=20196006; PubMed=10731132;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
CC -!- SIMILARITY: Belongs to the Mo25 family.
CC -------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -------
DR EMBL; AB000402; BAA19098.1;
DR EMBL; AE003526; AAF49422.1; -.
DR FlyBase; FBgn0017572; Mo25.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
FT CONFLICT 51 51 Y -> H (IN REF. 1).
FT CONFLICT 102 102 V -> L (IN REF. 1).
SQ SEQUENCE 339 AA; 39383 MW; 5790BD91754C1C74 CRC64;

Query Match 65.2%; Score 1111; DB 1; Length 339;
Best Local Similarity 65.0%; Pred. No. 4.9e-64;
Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3;



4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63 

I | | | || | : | | : | I I h : II l : l =11 I : I I I : I : : I : I l- : HI _ 
Pjb i MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLYGSSDAEPPA 60 

64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKBVTQIFNNILRRQIGTRSPTVEYISAH 122 

: MlhlMhl lh II =1 I I I I I I I . I I I I I : I I I I I I I I M I I I i I 
I& • B1 D Y WAQL SQEL YNSNLLLLL I QNLHRI D FEGKKHVAL I FNNVLRRQ I GTRS PTVE Y I CTK 120 

r.v m PHILFMLLKGYE- - APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180: 

Q/ " 'l III h III hill I Mill 1 = 1 II I hi h = l I hi I hi I II 1 1 

D& V21 p E iLFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 180 

181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 239 

1 1 1 hi! hi I II I hi hlh III I = h = ll Mil II = 11 II III II 1 = 1 il 

Db- 181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 24 0- 

-mo HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

Ml :|hiihllllllllhh = ll IIIIIIIIIMIMhhl =lh = llhll 



Db 241 HNFTVMTRYISEPENLKLMMNMLKEKSRNIQFEAFHVFKVFVANPNKPKPILDILLRNQT 300

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333

Db 301 KLVDFI.TNFHTDRSEDEQFNDEKAYLIKQIKELK 334



RESULT 6
MO2M_CAEEL
ID MO2M_CAEEL STANDARD; PRT; 338 AA.
AC O18211;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Hypothetical MO25-like protein Y53C12A.4 in chromosome II.
GN Y53C12A.4.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RA ^rs'haw J., Lennard N.;
RL Submitted (SEP-1997) to the EMBL/GenBank/DDBJ databases.
CC -!- SIMILARITY: Belongs to the Mo25 family.
CC -------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -------
DR EMBL; Z99277; CAB16486.1; -.
DR PIR; T27129; T27129.
DR WormPep; Y53C12A.4; CE14890.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.

Query Match 59.1%; Score 1006.5; DB 1; Length 338;
Best Local Similarity 57.2%; Pred. No. 2.2e-57;
Matches 191; Conservative 60; Mismatches 78; Indels 5; Gaps 1;

Cv 5 PLFSKSHKNPAEIVKILKDNLAILEK- - - - -QDKKTD^SEEVSKSLQAMKEILCGTNEK 59 

4 pIfgJo^ktpadwUrUlvidrhgtntserkvekaieetakmlalaktfiygsdan 



CC 



Db 



63 
119- 



,0EPP_AOELVS 

Db 64 EPNNEQVTQLAQEVYNANVLPM 123 

Qy ,20 S AHPHI LFMLLKGYEAPQ I ALRCGI MLREC IRHEPLAKI I LFSNQFRDFFKYVELSTFD I 179 

: | I || i | Ml I III II 1 1 1 1 =11! 1 1 : 1 = I •• I h II *h Ml 



124 AARPEILITIjLLGYEQPDIALTCGSMLREAVRHEHLARIVLYSEYFQRFFVFVQSDVFDI 183 
180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKL^ »9 



D b 184 ATDAFS 



slFiiiMTKiUcAEYLDNNYDRFFGQYSALTNSENYVTRRQSLKLLGELLLDR 243 



Hl i JJ E ll ie l^^ 303 



Db 244 HNFSTMNKY 

300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333 



K1VEFLTAFHNDRTNDEQFNDEKAYL IKQIQELR 3 3 7 



QY 

Db 3 04 

RESULT 7 

YFV6_SCHPO
ID YFV6_SCHPO STANDARD; PRT; 329 AA.

AC Q9P7Q8; 

DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Hypothetical protein C1834.06c in chromosome I.
GN SPAC1834.06C.
OS Schizosaccharomyces pombe (Fission yeast).
OC Eukaryota; Fungi; Ascomycota; Schizosaccharomycetes;
OC Schizosaccharomycetales; Schizosaccharomycetaceae;
OC Schizosaccharomyces.
OX NCBI_TaxID=4896;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=972;
RX MEDLINE=21848401; PubMed=11859360;

RA wood v.^wmi^ J ;> Bowman s ., 

R' TZVlV^l'.'. Browns., Chillingworth T. , Churcher C : ± 



RA Brooks K., Brown D Brown ^ - ^ T . , Eraser A., 

RA Collins M . , Connor R . , Cronm A. , Davis Hodgson G . , 



RA 
RA 



m....-.„^ w T.vlor R.G., Tivey A., Walsh S./., waireu , 



r , V ravlor R G Tivey A., Walsh S.V., Warren ' " niL 
Taylor K. Taylor > * Robben j. # Grym0 nprez B., 

RA Woodward Volckaert G^ ( Aerc , M ueller-Auer S., 

RA Weltjens I., Vanstreels E. ' r u^lzer' E Moestl D. , Hilbert H., 
- f be1 ' C R ' Tanger-i A.', Lehrach h'. , Reinhardt R. , Pohl T.M., 

*V :-' fM u A., Cadieu E . , Dreano S., Gloux S. Lelaure V t i r 



S S^. K -ZimSmann-W We,,, « R Purj.ll. - 

RA Goffcau A., Cadieu E . , Dreano S., Gloux S ., L-la 

RA Gilbert P.. Aves S.J Xiang Z Tillada V Garzon A., Thode G., 

RA Lucas M . , Rochet M. , ^" ^ ' Sanchez M del Rey P.. Benito J., 

- ^ n ^ z 'A Cr Tefuelt; ^Lrenol^t^g -/™g S.L., 
p| " r ti L ;'Lowe T . , Mccombie W.R., Paulsen I ■ , Potashkm J., 

S Shpakovski G.V., Ussery D: , Barreil B.C., Nurse P.; 



RT "The genome sequence of Schizosaccharomyces pombe.";
RL Nature 415:871-880(2002). 

CC -!- SIMILARITY: Belongs to the Mo25 family.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AL157734; CAB75774.1; -. 

DR PIR; T50117; T50117. 

DR GeneDB_SPombe; SPAC1834.06c; -.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 329 AA; 38521 MW; 073DD0607A64C952 CRC64;

Query Match 49.0%; Score 834.5; DB 1; Length 329;
Best Local Similarity 51.5%; Pred. No. 1.9e-46;
Matches 169; Conservative 63; Mismatches 93; Indels 3; Gaps 2;



ry.r 



j.. n 



£b .122 
Qy 185 



' t r-'S KSHKNPAE I VKILKDNLAI LE - KQDKKTDKAS EEVSKS LQAMKE I LCGTNEKEPPTE 64 

^ |: ::|: | Ml - II III I : I I'l I I II Mil I ! I = 

■1 LFNKR PKSTQDVVRCIjCDNLP KXjE INNDKK- -KSFEEVSKCLQNLRVSLCGTAEVEPDAD 61 

65 AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNHILRRQIGTRSPTVEyiSAHPH 124 

| : | : : : | I | h I • • I I III I I • ' I I i : : I M I : i • M I 
"62 LVSDLSFQIYQSNLPFLLVRYLPKLEFESKKDTGLIFSALLRRHVASRYPTVDYMLAHPQ 121; 

ILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAF 184 

| :\: I ::| L : | j I I I I I I ' : : I I I I I : : MINIMI 

.22 IFPVLVSYYRYQEVAFTAGSILRECSRHEALNEVLLNSRDFWTFFSLIQASSFDMASDAF 181 



244 



ATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAI 

:||| :| || ||:|: -I I : I I I M I I I M M I I II I II I : : M I I 
Db ia2 3TFKSILLNHKSQVAEFISYHFDEFFKQYTVLLKSENYVTKRQSLKLLGEILLNRANRSV 241 

Ow 9*1* MTKY I SKPENLKLMMNLLRDKS PNI QFEAFHVFKVF VAS PHKTQP I VE I LLKNQPKLI EF 304', 

\ " \\:\\\ Mlilll | I I I I I MIIMMMMIhl I- : : I II :h IN = 
Db 2.A2 MTRYISSAENLKLMMILLRDKSKNIQFEAEHVFKLFVANPEKSEEVIEILRRNKSKLISY 301. 

Gv 3Q^ LSSFQKERTDDEQFADEKNYLIKQIRDL 332 

^ " |.| : | :| Hill lh -MM I 
Db ?,02 LSAFHTDRKNDEQFNDERAFVI KQI ERL 32 9 



RESULT 8 
DE76_CHLPR
ID DE76_CHLPR STANDARD; PRT; 321 AA.

AC Q3.XFY6; 

DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Degreening related gene dee76 protein.
GN DEE76.
OS Chlorella protothecoides.
OC Eukaryota; Viridiplantae; Chlorophyta; Trebouxiophyceae; Chlorellales;
OC Chlorellaceae; Auxenochlorella.
OX NCBI_TaxID=3075;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=ACC25;
RX MEDLINE=20256472; PubMed=10798614;
RA Hortensteiner S., Chinner J., Matile P., Thomas H., Donnison I.S.;
RT "Chlorophyll breakdown in Chlorella protothecoides: characterization
RT of degreening and cloning of degreening-related genes.";

RL Plant Mol. Biol. 42:439-450(2000). 

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC ~- 

DR EMBL; AJ238632; CAB42595.1; -. 

DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 321 AA; 37262 MW; 918FD02964B09071 CRC64;

Query Match 45.5%; Score 776; DB 1; Length 321;
Best Local Similarity 52.0%; Pred. No. 9.8e-43;
Matches 156; Conservative 56; Mismatches 84; Indels 4; Gaps 3;



Qy 




DKKTDKASEEVSKSLQAMKEILCGTOEKEPPTKAVAQIAQELYGS 

-li- | = s||:: =:|| ='l ! f= 1 H 1 H= M = ' :: N 


91 - 


cb 


19 


ES KQDRWED I S KAIMS I REAI FGEDEQS S S KEHAQGI ASEACRVGLVSDLVT YLT VLDF 


78 ■ 


Qy 


92 


EGKKDVTQ I FNNI LRRQI - - GTRS PTVE YI S AHPHILFMLLKG YEAPQ I ALRCG IMLREC 
1 .||| 111 1:1 : | | | :|: Ml =1 1 Ml MM II MM 


149 




7 9 


ETFiCDW 


137 


Qy 


150 


IRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATF^LLTRH^ 


209 


Db 


138 


IRHEDIAKFVLECNLFEE^ 


197 


Qy 


210 


FEDYEKLLQSENYVTKRQSLKLLGELILDRHN^ 

1 . | | | 1 : | | | | : | | | | | | | 1 1 1 : 1 1 1 1 IMIM M Ml MM Ml 


269 


Db 




FSQLDKLLTSDNYVTRRQSLKLLGELLLDRVNVKIMMQYVSDVNNLILMMNLLKDSSRSI 


257 


Qy 


270 


QFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADE 

[Ml mi || hi hi:! hh MIMMh M 1 M MM MM Mhl 


32.9 


Db 


25c 


qUafhvfct^ 


316 


RESULT 9 

MO2N_ARATH
ID MO2N_ARATH STANDARD; PRT; 343 AA.
AC Q9FGK3;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Hypothetical MO25-like protein At5g47540.
GN AT5G47540 OR MNJ7.13.
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Columbia; 

RA Kaneko T., Katoh T., Asamizu E., Sato S., Nakamura Y., Kotani H.,
RA Tabata S.;
RT "Structural analysis of Arabidopsis thaliana chromosome 5. XI.";

RL Submitted (APR-1999) to the EMBL/GenBank/DDBJ databases. 

CC -!- SIMILARITY: Belongs to the Mo25 family. 



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB025628; BAB09080.1; -.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 343 AA; 39457 MW; 46950D6A9A82FBB5 CRC64;

Query Match 42.7%; Score 728; DB 1; Length 343;
Best Local Similarity 43.2%; Pred. No. 1.2e-39;

Matches 147; Conservative 79; Mismatches 100; Indels 14; Gaps 4; 

6 LFSK SHKNPAEIVKILKDNLAILEK '- -QDKKTDKASEEVSKSLQAMKEILCGTNE 58 

~ Y || : | i ■ : | : .: || == =111= I = I : * = : I I I I I = I 

Db 4 LFKSKPRTPADLVRQTRDLLLFSDRSTSLPDLRDSKRDEKMAELSRNIRDMKSILYGNSE 63 

59 KE P PTEAVAQLAQEL YS SGLLVTL I ADLQL IDFEGKKDVTQ I FNN ILRRQIGTRS PT VE Y 118 

l| || Mi II "= I II I - I HI 11= h 1=1= =1 : I 

Db 64 AEPVAEACAQLTQEFFKEDTLRLLITCLPKLNLETRKDATQWANLQRQQVNSRLIASDY 123- 

c . r ,- L , T ISAHPH1LFMLLKGYEAPQIALR.CGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFD 178 

^ : |: :: :|::|:| Y|| I I I I I I I I s s I I : J I : =11 I -I M 
Db 124 LEANIDLMDVLIEGFENTDMALHYGAMFRECIRHQIVAKYVLESDHVKKFFDYIQLPNFD 183 

ry . n 79 IASDAFATFKDLLTRKXVLVADFLEQNYDTIFEDY-EKLLQSENYVTKRQSLKLLGELIL 237 

||:|| MM H IN II I hi I =1 I I M Mhl Ihhll-Mlh-I 
Db 184 1 AADAAATFKELLTRHKSTVAEFLTKNEDWFFADYNS KLLES SNY ITRRQAI KLLGD ILL 243 

nv 238 DRHNFAIMTKYISKPENLKLMMNLLRDKS PNIQFEAFHVFKVFVASPHKTQPIVEILLKN 2 97 

Y || I I : I I I I : I I HI llllMhl h H II Ih I 

Db 044 DRSNS A^TKYVS S RDNLR I LMNLLRE S S KS I Q I E AFHVF KL FAANQNKPAD I VN I LVAN 303 



Qy 298 QPKLIEFLSSFQKERTDDEQFADEKNYL1KQI RDL 332 



Pb 



304 RSKLLRLLADLKPDK-EDERFEADKSQVLREIAALEPRDL 342 



RESULT 10 

MO2M_ARATH
ID MO2M_ARATH STANDARD; PRT; 343 AA.
AC Q9M0M4; O23570;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE Hypothetical MO25-like protein At4g17270.
GN AT4G17270 OR DL4670W.
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Columbia;



RX MEDLINE=98121113; PubMed=9461215;

Bevan M . , Bancroft I Bent E Lo ma w Dro8t L ... 



PA Bevan M. . Ban ro t , — stiekema W. , Drost 

^ ^rgkamp P.- , Hirkse W van Staveren pif£anelll p., 

pa H-Ldlev P., Hudson S.A., Rate! K. , Piurpn> Terrvn N. 

"Zjn-r v Wedler B. , Wambutt R . , Weitzenegger T . , Pohl T. Tenyn n . 
: ? .:-.!" e W ■ * ' vmSroei R- » De Clercq R. , van Montagu M. , Lechar^y A, 
' ' T le ' V ' -r rrpis M ., Lao N., Kavanagh T., Hempel S., ■; 
,S ^rp^'^ian'^O^^gerM., Schaefer M Funk B - - ■ 

\ ' q qilvevM James R. , Monfort A. , Pons A., 
- - 3L ^f:"ecr p" DoukfA i voukelatou E . , Milioni D . , Hatzopoulos P-. , 
,-u.igdomenech P., DouKa a Duesterhoeft A., Moored T.„ 

Piravandi E. , - er B Hi Ibert^, ^ ^ g , 

5? Soke ""-Be^gerT. D^seny M . , Voet M., Volckaert 0., Mewes H.-W., 

RT "Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of
RT Arabidopsis thaliana.";
RL Nature 391:485-488(1998).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Columbia;

a i»«»-jf«^:: o s; t ... »«p»y «... «•• 

»• K DueBtertoeft A. , stiekema «.. Entian K.-D., Terryn «.. 
I ... .^t j.. craven „,. 

P-A rel M . DeLsen i, • ■ ■ ■ M . , Bai E I 



Duesterhoeft A., Stiekema W. , Entian K.-D., Terr 

«. v-os P Hoheisel J.. aimmermannW.. Wedler H. . Ridley P.. 

i» Langham S.-».. Mccullagh B Bilha„ ^«f b *"f- ' Vandenbussch e F . , ' 

m ,aa *r Sch.erea d Or y«o„ e^ > .huan^ ». ^ 

' T — ~ - aTlHl . a peters S., van Staveren M . , Dirkse w., 
da Holze ^ E . , Brandt A., a ' _ , I^^ot-^^-r P 



RA 
RA 
RA 



De Keyser A., Buysshaert C. , Gielen J., Villarroel R De Clercq 
van Montagu M. , Rogers J., Cronin A., Quail , Bray- Allen S 
Clark L., Doggett J.. Hall S.. Kay M . , Lennard N. , McLay K Mayes R-, 

S =22 5:: S^r^-":'~T:'^« v;- 

RA Schnabl S., Hiller R. , Schmidt W. , Lecharny A., Aubourg S., 

^ m w ronke R Berqer C, Monfort A., Casacuberta E . , 

£ Gibbons r" weber l'.'. Vandenbol M . , Bargues M. , Terol J., Torres A 

RA Gibbons T eber ' Johnson S., Tacon D., Jesse T . , 

S nSnenT schwa™ 1 Seller P., Heber S . , Francs P BielKe C 
ne±J ' „ aa n Tpmrke K Mewes H.-W., Stocker S., 

S ££S 5: : b" n S: : EES £«. . a. .1. »• ■ • 

RR Sekhon M. .Murray J. . Sh «« ^ C ord.= ^ j _ _ 

~r= I:: s; r« Y l: , doud ,.: .»««.. ***** ->.,. 

„!„x P Bentley D . , Fulton B . , Miller ». , Greco T., Kemp K. 

Fulton L., Mardis E . , Dante M . , Fepin K. , miller L. , 

„ XT i r^^jtfrr^r^J^ c, 

s redone r^o„ r«»r^3 4 : 

r-h-n F Marra M. , Martienssen R. f McCombie W.R, ; 
RT "Sequence and analysis of chromosome 4 of the plant Arabidopsis
RT thaliana.";
RL Nature 402:769-777(1999).
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Columbia;

• . • zr --la-trie p w Ecker J . R . , Theologis A.; 
RT "Arabidopsis full length cDNA clones (RAFLs) sequenced by the
RT SSP consortium (Salk/Stanford/PGEC).";

RL Submitted (MAY-2001) to the EMBL/GenBank/DDBJ databases.
CC -!- SIMILARITY: Belongs to the Mo25 family.
CC - I- SSfLu sequence differs from that shown due to erroneous 
CC gene model prediction. 



RA 
RA 
RA 
RA 
RA 
RA 



CC 



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; Z97343; CAB10508.1; ALT_SEQ.
DR EMBL; AL161546; CAB78730.1;
DR EMBL; AF380659; AAK55740.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.



CC or se. 
CC 



DR 



KW Hypothetical protein.
SQ SEQUENCE 343 AA; 39650 MW; D340B49A4924B7D1 CRC64;

Query Match 42.0%; Score 716.5; DB 1; Length 343;
Pred. No. 6.5e-39;

^f^rlJ^ConseLattve^Ss: Matches 105; mdeis 9; Caps 3, 

6 LFSKSHKNPAEIVKIIiKDNLAILEK QDKKTDKASEEVSKSLQAMKEILCGTNE 53 

4 ilKSKPRTUDlwQTR^i^ADRSNSFPDLRESKREEKMVELSKSXRDLKLILYGNSE 63 

59 KEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGK^ 

S 4 AEPVAEACAQLTQEFFKADTLRRLLTSLPNLNLEARKDATQWANLQRQQVN 12 



Db 

QY 
Db 



3 

' '^J^ . 1"?«NFAIMTKYISKPBNLKLMMMLLRDKSPMIQFSAPHVFCTFVRSP 



)PKLieFLSSFQKERTDDEQFADEKNYLlKQIRDLK 333 

. II: | : : :: =! : ' I Q 

304 "ilJKLLRL LAD I KPDK - SDER FDADKAQ WRE I ANLK iS8 



RESULT 11

ID HYMA_EMENI STANDARD; PRT; 384 AA.
AC O60032;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Conidiophore development protein hymA.
GN HYMA.



OS Emericella nidulans (Aspergillus nidulans).
OC Eukaryota; Fungi; Ascomycota; Pezizomycotina; Eurotiomycetes;
OC Eurotiales; Trichocomaceae; Emericella.
OX NCBI_TaxID=162425;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=99126010; PubMed=9928930;



RX 
RA 
RT 
RT 



RL Mol. Gen. Genet. 260:510-521(1999).
CC -!- FUNCTION: Required for conidiophore development.
CC -!- SUBCELLULAR LOCATION: Cytoplasmic.
CC -!- SIMILARITY: Belongs to the Mo25 family.



CC 
CC 
CC 
CC 
CC 



. . T . ■ r^T-^Hnr-pd throuqh a collaboration 



DR EMBL; AJ001157; CAA04556.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 384 AA; 44392 MW; 2E203D0D110C5FD6 CRC64;

Query Match 39.1%; Score 666; DB 1; Length 384;
Best Local Similarity 39.8%; Pred. No. 1.2e-35;

Matches 140; Conservative 68; Mismatches 114; Indels 30; Gaps
Qy 12 KNPAEIVKILKDKLAILEKQDKKTD^ 71 

Db X1 ^isDWRsilDLLRi-REPSTAsivEDELAKQLSQMKLMVQGTQELEASTDQVHALVQ 69 

cv 72 ELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR RQIGTRSPTVEYI -SAHPHIL 126 

70 AMLHEDLLYELAVALHNLPFEARKDTQTIFSHILRFKPPHGNSPD ^9 

■ ■ 164 

•j 3 , 7 yMLLKGYEAPQIALRCGIMLRECIRHEPLAKI ILFSNQ - 

,.,0 "ELCRGYEHSQSAMPCGTILR^ 189 

-FRDFFKYVELSTFDIASDAFATF^ 

l" II • • • II I I I;! I : : : II I I I : I I : I I : I : I I I • I. 

190 VFWRFFHWIDRGTFELSADAFriFRElLTRHKSLVTGYLATNFDYFFAQFNTF.I.VQSESY 249 

223 VTKRQSLKLLGELILDRHNFAIMTKYISK^ 

• I t I I I . I I I I 1 • • I I ! I : : : I : : | Nil : I : I Mill U 1 ( 

25 g NrrKROSIK^ 

283 SPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDL 334 

. I I . : I | : | : : | : | | I = I I I I : I I I I I : : I : : I I I ' 
210 npdIsVAVQRILINNRDRLLRFLPKFLEDRTDDDQFTDEKSFLVRQIELLPK 361 



Db 

Qy 

■1} :■ 



Qy 

Db 



RESULT 12 

MO2L_ARATH
ID MO2L_ARATH STANDARD; PRT; 348 AA.
AC Q9ZQ77;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Hypothetical MO25-like protein At2g03410.
GN AT2G03410 OR T4M8.16.
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]

RP SEQUENCE FROM N.A. 
RC STRAIN=cv. Columbia; 



O 



RX MEDLINE=20083487; PubMed=10617197;
RA Lin X., Kaul S., Rounsley S.D., Shea T.P., Benito M.-I., Town C.D.,
RA Fujii C.Y., Mason T.M., Bowman C.L., Barnstead M.E., Feldblyum T.V.,
RA Buell C.R., Ketchum K.A., Lee J.J., Ronning C.M., Koo H.L.,
RA Moffat K.S., Cronin L.A., Shen M., Pai G., Van Aken S., Umayam L.,
RA Tallon L.J., Gill J.E., Adams M.D., Carrera A.J., Creasy T.H.,
RA Goodman H.M., Somerville C.R., Copenhaver G.P., Preuss D.,
RA Nierman W.C., White O., Eisen J.A., Salzberg S.L., Fraser C.M.,
RT "Sequence and analysis of chromosome 2 of the plant Arabidopsis
RT thaliana.";
RL Nature 402:761-768(1999).

CC -!- SIMILARITY: Belongs to the Mo25 family. 



RA 
RA 
RA 



CC 



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC """" 

DR EMBL; AC006284; AAD17435.1;
DR PIR; B84448; B84448.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 348 AA; 40000 MW; AB1D92EA2E2B900E CRC64;

Query Match 37.1%; Score 632; DB 1; Length 348;
Best Local Similarity 38.7%; Pred. No. 1.6e-33;
Matches 133; Conservative 80; Mismatches 117; Indels 14; Gaps 5;



QY 



Do 



GFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASE- - '-. EVSKSLQAMKE1LCGTNF. a 8 

j I •1111: = | = I = I = = = = I I = | :::::: | | | | I 

4 LFKNKSRLPGEIVRQTRDLIALAESEEEETDARNSKRLGICAELCRNIRDLKS ILYGNGE S3 

59 KEP^TEAVAQLAQELYS SGLLVTL I ADLQLIDFEGKKD VTQI FNNI LRRQ IGTRS PTVEY 118 

|l" II | || : : | || : =11 HI Ml I : : : I : I U 
. 64 AEPVPEACLLLTQEFFRADTLRPLIKSIPKLDLEARKDATQIVANLQKQQVEFRLVASEY 12 3 

Uv 119 ISAHPHILFMLLKGYEAP-QIALRCGIMLRECIRIlEPLAKTILFSNQFRDFFKYyELSTF 177 

rfc 124 ^ESNLDVIDsivEGIDHDHELiLHYTGMLKECVRHQVVAKYILESKNLEKFFDYVQLPYF 183 

ry, l-'8 DIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYE-KLLQSENYVTKRQSLKLLGELI 236- 

y " l.l-M | :: | | II | | ||:s| :||: 1=1 IN'- =11111=1111 = = = 

184 ivATDASKIFRELLTRHKSTVAEYLAKNYEWFFAEYNTKLLEKGSYFTKRQASKLLGDVL 24 3 

2 - 7 LDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLK 2 96- 

" -l| I =( !l = l = I I = = I I I I I I = : I I I I I I I : I I = I I I = : I : ! I I I = ft , 
MDRSNSGVMVKYVSSLDNLRIMMNLLREPTKNIQLEAFHIFKLFVANENKPEDIVAILVA 303 



Db 

QY 
Db 



Qy 297 NQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK KTA 336 

304 NRTKILRLFADLKPEK-EDVGFETDKALVMNEIATLSLLDIKTA 346 



.Ob 



RESULT 13 
HYM1_YEAST
ID HYM1_YEAST STANDARD; PRT; 399 AA.
AC P32464;
DT 01-OCT-1993 (Rel. 27, Created)

DT 01-OCT-1993 (Rel. 27, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE HYM1 protein. 

GN HYM1 OR YKL189W.
OS Saccharomyces cerevisiae (Baker's yeast).
OC Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
OC Saccharomycetales; Saccharomycetaceae; Saccharomyces.
OX NCBI_TaxID=4932;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=GRF88;
RX MEDLINE=93348778; PubMed=8394042;
RA Cheret G., Mattheakis L.C., Sor F.;
RT "DNA sequence analysis of the YCN2 region of chromosome XI in
RT Saccharomyces cerevisiae.";
RL Yeast 9:661-667(1993).
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=94205264; PubMed=8154135;
RA Wiemann S., Voss H., Schwager C., Rupp T., Stegemann J.,
RA Zimmermann J., Grothues D., Sensen C., Erfle H., Hewitt N.,
RA Banrevi A., Ansorge W.;
RT "Sequencing and analysis of 51.6 kilobases on the left arm of
RT chromosome XI from Saccharomyces cerevisiae reveals 23 open reading
RT frames including the FAS1 gene.";
RL Yeast 9:1343-1348(1993).
RN [3]
RP SEQUENCE FROM N.A.
RA Maia e Silva A., Bossier P., Vilela C., Fernandes L., Soares H.,
RA Guerreiro P., Rodrigues-Pousada C.;

RL Submitted (MAR-1994) to the EMBL/GenBank/DDBJ databases. 

RN [4] 

RP GENE NAME. 

RX MEDLINE=20157038; PubMed=10655212;
RA Dorland S., Deegenaars M.L., Stillman D.J.;
RT "Roles for the Saccharomyces cerevisiae SDS3, CBK1 and HYM1 genes in
RT transcriptional repression by SIN3.";
RL Genetics 154:573-586(2000).
CC -!- SIMILARITY: Belongs to the Mo25 family.



CC 



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X69765; CAA49422.1; -.
DR EMBL; X74151; CAA52249.1;
DR EMBL; Z28189; CAA82032.1; -.
DR PIR; S34681; S34681.
DR SGD; S0001672; HYM1.
DR GO; GO:0005622; C:intracellular; IDA.
DR GO; GO:0016564; F:transcriptional repressor activity; IMP.
DR GO; GO:0007109; P:cytokinesis, completion of separation; IMP.
DR GO; GO:0008360; P:regulation of cell shape; IGI.



DR InterPro; IPR004892; Mo25. 



DR 

SQ SEQUENCE 



GY 



Query Match 28.5%; Score 485; DB 1; Length 399; 

Best Local Similarity 33 . 0%; Pred. *>;*;* e -^ 

Matches 113; Conservative 75; Mismatches 138, lndeis , 

7 FSKSHKNPAEIVK'ILKDNLAILEK- - - -QDKKTD^SEEVSKSLQAMKEILCGTNEKEPP 62 
. 16 WKKNPKTPSDYARLIIEQLNKFSSPSLTQDIinCR-KVQEECT 

qv a TEAV^QLAQELYSSGLLVTLIADLQLn™ 122 ' 

Db 7E pUvDEiyTAMHRADVFYEiLi-HB'VDtEUARRECMLlFSlC^YSKDNKEVTVUYLVSg 134 

, v PHILFMLLKGYE- -. ^^^^^^^^^^^^^^^^^^^^^^'^^^^ ^ 

; b , PKTI SLMLRTAEVALQQKGCQDI FLTVGNMI IECIKYEQLCRI IIiKDPQLWKPFEFAKLG 194:. 

- TFDIASDAFATFKDLLTRHKVLVA-DFL- -EQNYDTIFEDYEKLLQSENYyr^QSLKLL 232 

i55 »JFEISTESLQIL'SAAF : TAHPKLVSKEFFSNEINIIR 254 

3 3 G 3Ij I LDRRNFAIMTKY I SKPENLKLMMNLLRDKS PN IQFEAFHVFKVFV AS PHljCTQP I VE 2 92 
>'-? ' || | | I . i j I • I I 1 I I I : I : I | | : ! I I I : I I II I I : I I ' • I " ' 

25r , jVoLIVTRSNNALMNIYINSPEMl 314 

•,.93 ILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKK 334 

II I ' - I I • • • I : : I I H = : : : : : I I : 
315 livKNRDKLLTYFKTFGLD-SQDSTFLDEREFIVQEIDSLPR 355 



Db 



RESULT 14 
MO2L_CAEEL
ID MO2L_CAEEL STANDARD; PRT; 339 AA.
AC Q9TZM2;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Hypothetical MO25-like protein T27C10.3 in chromosome I.
GN T27C10.3.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;

pa ^hn H J Graves T . , Hawkins M . ; 

RL Submitted (OCT-1998) to the EMBL/GenBank/DDBJ databases.



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



DR EMBL; AF098504; AAC67411.1;
DR PIR; T33477; T33477.
DR WormPep; T27C10.3; CE19605.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 339 AA; 40232 MW; E7DA45CA33F2947E CRC64;

Query Match 8.4%; Score 143.5; DB 1; Length 339;
Best Local Similarity 19.3%; Pred. No. 0.02;

Matches 38; Conservative 50; Mismatches 76; Indels 33; Gaps

-i c q IXjFS'IQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ 218 

• • -"i ■ 1 1 1 I h I I : : : '■ I : : 1 : 1 1 : 

, oh , 00 LMNTNKFRD FiWI QGTFDTLQ 1 1 FFTliHES ANNF I KNNLPRFMQTLHKL I A 150 

>!•', S EN YVTKRQSLKLLGEL I LDRHNFAIMTKY LSKPENLKLMMNLLRDKS PNI QFEAFHVFK 27 8 

| . . . I I I I I : | : :::::! : i | : = ■ ■ ■ : I : : 

CS NFFIQAK3FKFLNEI:FTAQTNYETRSLWMAEPAFIKI.WLAIQSNKHAVRSRAVSILE 210: 

,70 VFVASPHKTQPIVEILLKNQPKLIEFL- SSFQKERTDDEQFAD- 32.0 

'•' . j . -I. • : | : . | : !| I I =11 ! " I 5 I 

r,;, 2 11 1 F TRNPRNSPEVHEF IGRNRNVL I AFFFNS AP IHYYQGS PNEKE - - - D AQYARM AYKLLN 267 

Q y 3.21 -— EKNYLIKQIRDLKK 3 34 

Db 26 8 WDMQRPFTQEQLQDFEE 2 84 



RESULT 15
AKA9_HUMAN
ID AKA9_HUMAN STANDARD; PRT; 3911 AA.
AC Q99996; O14869; O43355; O94895; Q9UQH3; Q9UQQ4; Q9Y6B8; Q9Y6Y2;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE A-kinase anchor protein 9 (Protein kinase A anchoring protein 9)
DE (PRKA9) (A-kinase anchor protein 450 kDa) (AKAP 450) (A-kinase anchor
DE protein 350 kDa) (AKAP 350) (hgAKAP 350) (AKAP 120 like protein)
DE (Hyperion protein) (Yotiao protein) (Centrosome- and Golgi-localized
DE PKN-associated protein) (CG-NAP).
GN AKAP9 OR AKAP450 OR AKAP350 OR KIAA0803.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]






RP SEQUENCE FROM N.A. (ISOFORM 4).
RC TISSUE=Brain;

RX ^T^l^^T^^^ • seaxoc, R • « Kim J.U. ,
RT "Yotiao, a novel protein of neuromuscular junction and brain that
RT interacts with specific splice variants of NMDA receptor subunit
RT NR1.";
RL J. Neurosci. 18:2017-2027(1998).

RP SEQUENCE FROM N.A. (ISOFORM 2), AND VARIANT GLN-1347 INS.
RX MEDLINE=99219864; PubMed=10202149;
RA Witczak O., Skaalhegg B.S., Keryer G., Bornens M., Tasken K.,
RT "Cloning and characterization of a cDNA encoding an A-kinase anchoring
RT protein located in the centrosome, AKAP450.";

RL EMBO J. 18:1858-1868(1999). 

RN [3] 

RP SEQUENCE FROM N.A. (ISOFORM 3).
RC TISSUE=Brain;
RX MEDLINE=99287934; PubMed=10358086;
RA Takahashi M., Shibata H., Shimakawa M., Miyamoto M., Mukai H., Ono Y.;
RT "Characterization of a novel giant scaffolding protein, CG-NAP, that
RT anchors multiple signaling enzymes to centrosome and the Golgi
RT apparatus.";
RL J. Biol. Chem. 274:17267-17274(1999).



RN
RP SEQUENCE FROM N.A. (ISOFORM 1).
RA Kemmner W.A., Deiss S., Schwarz U.;
RT "Cloning of Hyperion.";
RL Submitted (AUG-1998) to the EMBL/GenBank/DDBJ databases.
RP SEQUENCE OF 323-3911 FROM N.A. (ISOFORM 2).
RC TISSUE=Gastric parietal cell;
RX MEDLINE=99115654; PubMed=9915845;
RA Schmidt P.H., Dransfield D.T., Claudio J.O., Hawley R.G.,
RA Trotter K.W., Milgram S.L., Goldenring J.R.;
RT "AKAP350, a multiply spliced protein kinase A-anchoring protein
RT associated with centrosomes.";
RL J. Biol. Chem. 274:3055-3066(1999).

RP SEQUENCE OF 1802-3876 FROM N.A. (ISOFORM 5).
RC TISSUE=Lymphoblast;

™ -.v'-s'-^q V sutterer C, Becker M. , Hawkins M. ; 
RL Submitted (JAN-1998) to the EMBL/GenBank/DDBJ databases.

RP SEQUENCE OF 2157-3911 FROM N.A. (ISOFORM 6).
RC TISSUE=Lung;

RA Milgram S.L., ^^^^liced" f amn"of proteins with centrosomal 
RT "AKAP350: A multiply spliced family of proteins with centrosomal
RL Submitted (SEP-1998) to the EMBL/GenBank/DDBJ databases.

RP SEQUENCE OF 2212-3911 FROM N.A. (ISOFORM 2/3).
RC TISSUE=Brain;
RX MEDLINE=99087487; PubMed=9872452;
RA Nagase T., Ishikawa K.-I., Suyama M., Kikuno R., Miyajima N.,
RA Tanaka A., Kotani H., Nomura N., Ohara O.;



RL 
EH [71 

RC 



RT "Prediction of the coding sequences of unidentified human genes. XI.
RT The complete sequences of 100 new cDNA clones from brain which code
RT for large proteins in vitro.";
RL DNA Res. 5:277-286(1998).
RN [9]
RP SEQUENCE OF 17-1800 FROM N.A.



RL submitted ASBP-1998) to he ^ SUBUNITS OF PROTEIN KINASE 

cc ... ™ CTI0N - T ^^ S p ^ T ^^ E T ^ T R ASSEMB LES SEVERAL PROTEIN KINASES AND 

cc A. S C AF ^LD c nu CENTRO SOME AND GOLG I APPARATUS WHERE PHYSIOLOGICAL 

CC PHOSPHATASES ON CENTRO SOME ^ STATE OF PROTEIN 

CC EVENTS CAN BE REGULATED BY PHU THE N-METHYL-D- 

cc SUBSTRATES. ISOFORM ^^^^ll FOUND IN THE NEUROMUSCULAR 

CC C ££ P " AS IN NEURONAL SYNAPSES EXPLAINING THAT ITS 

C C C SSTS BE TO ORGANIZE ^S^^™^ KINASE N 

f c - ! " ^x T pR™^^ -Tf 1 ^™ 1 (ppl) 

THE IMMATURE NON - PHOS PHORYLATED FORM OF PKC EPSILON . 

CC -!- SUBCELLULAR LOCATION: CENTROSOMAL IN MANY CELL TYPES AND
CC CYTOPLASMIC IN PARIETAL CELLS.
CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=6;
CC Name=1;
CC IsoId=Q99996-1; Sequence=Displayed;
CC IsoId=Q99996-2; Sequence=VSP_004102, VSP_004107;
CC IsoId=Q99996-3; Sequence=VSP_004102, VSP_004105, VSP_004107;
CC IsoId=Q99996-4; Sequence=VSP_004103, VSP_004104;
CC IsoId=Q99996-5; Sequence=VSP_004108;
CC Name=6;
CC IsoId=Q99996-6; Sequence=VSP_004106, VSP_004107, VSP_004109;
CC -!- TISSUE SPECIFICITY: WIDELY EXPRESSED. ISOFORM 4/YOTIAO IS HIGHLY

- " ' ' *"^*'££rs™ TO^FORM^AN AMPHIPATHIC HELIX, 

! " 1 f^OULD^ PARTI CI PATE IN PROTEIN - PROTEIN INTERACTIONS WITH A 

ESS^" SVx^^U SHOWN DUE TO FOUR 
F RAMESHI FTS IN POSITIONS 29, 1653, 1699 AND 173,. 



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

cc 

DR EMBL; AJ131693; CAB40713.1; -.
DR EMBL; AB019691; BAA78718.1; -.
DR EMBL; AJ010770; CAA09361.1; -.

DR EMBL; AF026245; AAB86384.1; -. 



DR EMBL; AF083037; AAD22767.1;
DR EMBL; AC004013; AAB96867.1; ALT_FRAME.
DR EMBL; AF091711; AAD39719.1; -.
DR EMBL; AB018346; BAA34523.1; -.
DR EMBL; AC000066; AAC60380.1; ALT_FRAME.
DR Genew; HGNC:379; AKAP9.
DR MIM; 604001; -.
DR GO; GO:0005813; C:centrosome; TAS.
DR GO; GO:0005856; C:cytoskeleton; TAS.
DR GO; GO:0004973; F:N-methyl-D-aspartate receptor-associated pr.
DR GO; GO:0005515; F:protein binding activity; TAS.
DR GO; GO:0007165; P:signal transduction; TAS.
DR GO; GO:0006832; P:small molecule transport; TAS.

PKA-RII SUBUNIT BINDING DOMAIN.
COILED COIL (POTENTIAL) 
COILED COIL (POTENTIAL) 
COILED COIL (POTENTIAL) 
COILED COIL (POTENTIAL) 
COILED COIL (POTENTIAL) 
COILED COIL (POTENTIAL) 
COILED COIL (POTENTIAL) 
COILED COIL (POTENTIAL) 

COILED COIL (POTENTIAL) 

COILED COIL (POTENTIAL) 

COILED COIL (POTENTIAL) 

COILED COIL (POTENTIAL) 

COILED COIL (POTENTIAL) 

POLY-LEU.
GLN-RICH.
GLU-RICH.
GLU-RICH.

Missing (in isoform 2 and isoform 3).
/FTId=VSP_004102.
QLQEEI -> LATRRD (in isoform 4).
/FTId=VSP_004103.
Missing (in isoform 4).
/FTId=VSP_004104.
Missing (in isoform 3).
/FTId=VSP_004105.
SADTFQKVE -> Q (in isoform 5).
/FTId=VSP_004106.
VFGFYNMCFSTLC -> GSSIPELAHSDAYQTREICSS
(in isoform 2, isoform 3 and isoform 6).
/FTId=VSP_004107.
Missing (in isoform 5).
/FTId=VSP_004108.
STTQFHAGMRR -> ALSLTTSWQHHSARPTAPLFFEILSH
SLG (in isoform 6).
/FTId=VSP_004109.
K -> KQ.
/FTId=VAR_010926.
E -> Q (IN REF. 3).
M -> I (IN REF. 3).
E -> G (IN REF. 3).
R -> S (IN REF. 3).
N -> S (IN REF. 3).

FT CONFLICT 663 663
FT CONFLICT 913 913 H
FT CONFLICT 956 956
FT CONFLICT 980 982 Q -> P (IN REF. 1 AND 2).
FT CONFLICT 997 997
FT CONFLICT 1001 1001 Q -> P
FT CONFLICT 1020 1020 N -> D (IN REF. 3).
FT CONFLICT 1028 1028 V
FT CONFLICT 1626 1626
FT CONFLICT 1703 1703 N
FT CONFLICT MISSING (IN REF. 5).
FT CONFLICT 1843 1843 -> P (IN REF. 3).

Query Match 7.5%; Score 128.5; DB 1; Length 3911;

Indels 115; Gaps 15;

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77
Db 664 xEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ--FEKDNLITKQNQLILE- 710

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI-GTRSPTVEYISAHPHI 125
Db 711 --ISKLKDLQOSLWSKSEEMTLQI--NELQKEIEILRQEEKEKGTLF.QEVQELQLKTEL 766

Qy 126 ,FHLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYyELSTFDT,SDAFA 185
Db 767 JjEKQMKEKE- NDLOEKFAQLEAEN-SXLKDEKK 797

Qy 186 XFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGEL IL 237
Db 798 TLEDMLKIHTPVSQEERLIFLDSIKSKSKD 857

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264
Db 858 QRNTFSFAEKlsIFEWYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 917

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILL^QPKLIEFLSSFQKSRTD-DEQFAD- 320
Db 918 NPTTVKMKSSVFDEDKTFVA ETLEMGEVVEICDTTELMEKLEVTKREKLELSQRLSDL 974

Qy 321 EKNYLIKQIRDLKK 334
Db 975 SEQLKQKHGEISFLNEEVKSLKQ 997
Job time : 20 secs



